Hello everyone, I just joined the email list and would like to introduce
myself.
I'm designing a global file system with permanent caching and location
independence. Caching is integrated, and the system was specifically designed
for performance. What's unique about it is that it appears to be a transparent
method, and I see no reason why it can't end up emulating NFS and CIFS, while
increasing their speed. Basically, MFS separates a file into 2 fundamental
units: information and data. All of its features are byproducts of this
separation. Distributed cache concurrency is achieved via cryptology and a
mathematical identity function. The essence of this approach is illustrated
here: www.mercuryfs.net/design/fig_9.pdf. The file separation design is
illustrated here: www.mercuryfs.net/design/fig_8.pdf. Note that the
information record is fixed in size. Because a file's information is
decoupled from its data, a file update can be pushed (broadcast) via a new
information record. The data records use pull replication: given the new
information record, only those that want the data record will request it.
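The push/pull split above can be sketched roughly in Python. This is only an
illustration under stated assumptions: the names InfoRecord, Origin and Node,
the mfs:// URL used in the usage note, and the use of SHA-256 as the data
record's "math value" are all hypothetical and are not taken from the MFS
design documents.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InfoRecord:
    """Small description of one file version (hypothetical layout)."""
    url: str
    data_digest: str  # ASSUMPTION: SHA-256 hex stands in for the "math value"


@dataclass
class Origin:
    """Holds data records and pushes information records to subscribers."""
    store: dict = field(default_factory=dict)        # digest -> bytes
    subscribers: list = field(default_factory=list)  # Node objects

    def publish(self, url: str, data: bytes) -> InfoRecord:
        digest = hashlib.sha256(data).hexdigest()
        self.store[digest] = data
        info = InfoRecord(url, digest)
        for node in self.subscribers:
            node.on_info(info)  # push: only the small record travels
        return info

    def fetch(self, digest: str) -> bytes:
        return self.store[digest]


@dataclass
class Node:
    """Receives every information record; pulls data only when it is read."""
    origin: Origin
    known: dict = field(default_factory=dict)  # url -> InfoRecord
    cache: dict = field(default_factory=dict)  # digest -> bytes
    pulls: int = 0                             # number of data records pulled

    def on_info(self, info: InfoRecord) -> None:
        self.known[info.url] = info

    def read(self, url: str) -> bytes:
        info = self.known[url]
        if info.data_digest not in self.cache:
            # pull: the data record is requested only on demand
            self.cache[info.data_digest] = self.origin.fetch(info.data_digest)
            self.pulls += 1
        return self.cache[info.data_digest]
```

For example, after origin.publish("mfs://example/file", b"v1") every
subscribed node knows about the new version, but only a node that actually
calls read("mfs://example/file") pulls the data record.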
Because of the decoupling, the provider of the data record is irrelevant.
This is because data records are accessed by their math values, and only the
correct data record will have the same math values. When an MFS file is
accessed, first the URL is resolved to an MFS handle, then the data is
requested via the MFS handle. The handle is built for an identity function
(see the top of FIG 8 for the MFS handle formula; note that it can be
extracted from either the information record or the data info record). The
handle is created from the URL and its data. In this respect, 2 identical
files will create the same handle. That's the identity function. The MFS
handle is a globally unique identifier. Since data is requested via the MFS
handle, every node along the route that the request follows can eavesdrop on
the request and/or the data that flows thru it. Imagine that I'm in
California, and I request something from Japan.
Hawaii eavesdrops on the request as it passes by. Now, Hawaii may have the
file, but it doesn't need to do a search, figure out it doesn't have it,
then pass the request on, thus incurring the delay of the search. Instead,
it immediately passes the request, then as the data starts to flow from
Japan thru Hawaii, Hawaii can THEN figure out it has a copy, and
TRANSPARENTLY continue the data stream instead of Japan, thus saving the
traffic between Hawaii and Japan. To see a similar example of this, go to
www.mercuryfs.net/design/fig_39.pdf. As a low-priority task, Hawaii can begin
the search when the request first flows thru it, and can have it completed
after Japan has started sending its data. If a large movie is being watched in
New York, from Japan, over time all the links transporting the movie will have
searched to see if they have it, in a group effort to conserve bandwidth.
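The Hawaii example can be simulated in a few lines of Python. This is a toy
sketch, not the MFS protocol: the function names, the chunk size, and the
search_delay_chunks parameter (which stands in for the time the low-priority
search takes) are all assumptions made for illustration. It also assumes that
a cached copy stored under the same handle is byte-identical to the origin's.

```python
from typing import Dict, Iterator, Tuple

CHUNK = 4  # bytes per chunk; tiny so the example is easy to follow


def origin_stream(data: bytes) -> Iterator[bytes]:
    """The far end (Japan in the example) sending the file chunk by chunk."""
    for i in range(0, len(data), CHUNK):
        yield data[i:i + CHUNK]


def relay(handle: str,
          upstream: Iterator[bytes],
          local_cache: Dict[str, bytes],
          search_delay_chunks: int) -> Iterator[Tuple[str, bytes]]:
    """Forward chunks immediately; take over once the local search is done.

    Yields (source, chunk) pairs so the caller can see who supplied each
    chunk. The search is considered finished after search_delay_chunks
    chunks have been forwarded (a stand-in for real search latency).
    """
    sent = 0       # bytes already delivered downstream
    forwarded = 0  # chunks relayed from upstream so far
    while True:
        if forwarded >= search_delay_chunks and handle in local_cache:
            # Search finished and found a copy: continue the stream locally
            # from the current offset and stop pulling from upstream.
            copy = local_cache[handle]
            for i in range(sent, len(copy), CHUNK):
                yield ("relay", copy[i:i + CHUNK])
            return
        chunk = next(upstream, None)
        if chunk is None:
            return  # upstream finished before the relay could take over
        yield ("origin", chunk)
        sent += len(chunk)
        forwarded += 1
```

With a 16-byte file, a cached copy at the relay, and search_delay_chunks=2,
the first two chunks come from the origin and the last two from the relay,
and the bytes the requester receives are the same either way.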
Remember that the MFS handle is a function of the URL and data; therefore
it's NOT assigned by the file system. Instead, it's assigned TO the file
system, by the file. THAT is a very important distinction between MFS and
all the others.
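To make the idea of a handle that is a function of the URL and data concrete,
here is a minimal Python sketch. The real MFS handle formula is the one at the
top of FIG 8 and is not reproduced here; the construction below (SHA-256 over
the URL, a separator byte, and a digest of the data) and the names make_handle
and verify are assumptions chosen only to show the principle.

```python
import hashlib


def make_handle(url: str, data: bytes) -> str:
    """Derive a handle from the URL and the data (ASSUMED construction).

    Nothing here is assigned by a server: anyone holding the same URL and
    the same bytes computes the same handle.
    """
    data_digest = hashlib.sha256(data).digest()
    h = hashlib.sha256()
    h.update(url.encode("utf-8"))
    h.update(b"\x00")  # separator so the URL/data boundary is unambiguous
    h.update(data_digest)
    return h.hexdigest()


def verify(handle: str, url: str, data: bytes) -> bool:
    """Check that data received from any provider matches the handle."""
    return make_handle(url, data) == handle
```

Because the check depends only on the bytes received, it makes no difference
which node supplied them: a copy that does not match the handle is rejected,
and a copy that matches is accepted no matter where it came from.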
I know this email list is about performance measurements. The goal of MFS
is performance, via content acceleration and a new method of caching that
allows network files to remain resident locally. Since MFS can save a lot of
bandwidth, I figured this would be the right group to talk to. I think you
guys will like this design.
The web site is www.mercuryfs.net
If you guys could direct me to any other projects that have tried this
method and failed, I'd appreciate it. I've been looking hard, because so
far MFS appears to be unique. It differs quite a bit from AFS, Coda,
OceanStore, Freenet, and NFS. But surely others must have taken this
approach before?
Comments are welcome, I just took the design public this week.
- josh