After knocking SSDs for poor performance, Fusion-io is now building one itself – but throwing out existing speed-limiting SSD interfaces designed for disk drives. A preview device, running at 95,000-plus IOPS, was shown at HP's Discover event in Vienna last month. Fusion-io has criticised SSDs for poor performance, saying: …
Isn't faster and faster data access just going to lead to code being even more bloated and sloppily written than it is today?
After all, once your code produces the needed result consistently and in an acceptable time, how many people go in and optimise it further?
That's not going to stop vendors from wanting to sell ever-faster kit, though. Quite the contrary. Figure out a way to produce better code with less effort and you can outperform the competition on cheaper kit.
T10 is working on a SCSI over PCI Express standard. It's just another transport for SCSI. They abbreviate it as SOP.
Missed the boat
I thought the enterprisey solution to databases these days was just to put it all in RAM?
With RAM prices dropping like a brick, at some point it becomes cheaper than fancy enterprise-grade PCIe-bussed SSD.
You may consider it of some interest to keep that RAM's contents alive if the power supply blows, so it can't be the standard RAM interface. Battery-backed SSD, though? Why not.
Flash-backed RAM, in large quantities. What do you mean, PCIe, SATA, what-have-you?
Still, something to tide us over between now and then will probably have its day in the market.
Another one of these, or related?
Dying a Slow Death
The existing SSD market won't lie down and die anytime soon. At least not until SOP gets itself integrated onto motherboards, removing the extra hardware cost to the consumer. It could very well take over the corporate sector, though.
"Abandoning HDD interface paradigm for SSDs"
I might be naive... I'm just a lowly system-level developer with a background in electronic design as a hobby, but... Oh, I also released for review (didn't publish it properly, but it got around) the design of a new RAID system which a year later shipped as BeyondRAID (without credit where credit was due). That's fine, though, since I designed it and released it so someone else could implement it and I could buy one :) It would have been nice of them to toss me one for free... after all, I spent a year designing the system on napkins.
IDE was basically an ISA bus interface for hard drives. It moved the physical controller from the adapter card or motherboard onto the drive itself. The only real task left for the controller card was basic addressing support: it trimmed the 16-bit address bus down to a 3-bit one.
What's quite funny is that for the most part, the IDE interface stayed almost 100% compatible with the original MFM controller style of addressing drives for a very long time. This was quite different from SCSI which overloaded the BIOS disk I/O interrupt (the ancient alternative to a syscall interface) to map to a SCSI vendor specific BIOS. IDE didn't need this since the IBM AT BIOS already understood the I/O interface of the MFM controller cards. The biggest change to the spec for years (at least until we needed to try to abandon the physical addressing method in favor of LBA) was support for the drive to report its own information for auto-configuration.
Most of what we call an API to the SATA drive interface today is still based on the original MFM controller API specification, with a few changes: drives are no longer addressed physically, they provide feedback about their state, and they no longer reside at fixed memory addresses. There are extensions as well, for things like NCQ, but in theory, using a SATA-to-IDE ser/des and an IDE-to-ISA adapter (not controller), an IBM AT at 6 MHz should be able to address at least the first 128 megabytes of the latest SATA-II drive on the market.
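That continuity is easier to see in the arithmetic behind the move from physical addressing to LBA: the logical block address is just the CHS triple flattened out. A minimal sketch, where the geometry constants are illustrative rather than taken from any particular drive:

```python
def chs_to_lba(c, h, s, heads_per_cyl=16, sectors_per_track=63):
    """Flatten a physical cylinder/head/sector triple into a logical
    block address. CHS sector numbering is 1-based, hence the s - 1."""
    return (c * heads_per_cyl + h) * sectors_per_track + (s - 1)

# First sector of the disk is C=0, H=0, S=1:
print(chs_to_lba(0, 0, 1))   # LBA 0
print(chs_to_lba(0, 1, 1))   # first sector of the next head: LBA 63
```

The inverse mapping is the same arithmetic run backwards, which is why drives could keep presenting a fake geometry long after the physical layout stopped matching it.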
This step was actually a logical step back to our roots with a few exceptions.
1) Connecting more than one drive to a single PCIe slot will require either sharing the slot's PCIe lanes across the connected drives, or adding a PCIe switch to create more lanes for the traffic.
2) RAID will have to be performed in software. With drive performance approaching RAM speeds, the CPU will just not be able to keep up with the massive amount of XORing involved. RAID-Z is possible, but ZFS is not very flexible unless working with predefined bank sizes.
3) Enterprise computing will have to wait for a cross between an XOR engine and switch for RAID to come around. I'm sure there's a market for it... but it's almost certainly not the right way to do this.
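For context on why that XOR load matters, the parity a RAID-5-style array maintains is just a byte-wise XOR across the data blocks of each stripe, recomputed on every write. A minimal sketch:

```python
def xor_parity(blocks):
    """RAID-5-style parity for one stripe: byte-wise XOR across
    all the data blocks. Any single lost block can be rebuilt by
    XORing the parity with the surviving blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

data = [b"\x0f\x0f", b"\xf0\xf0", b"\xff\x00"]
parity = xor_parity(data)
# Rebuild a "failed" first block from parity plus the survivors:
rebuilt = xor_parity([parity, data[1], data[2]])
assert rebuilt == data[0]
```

Real implementations do this with SIMD over much larger blocks, but the per-write cost scales with stripe width either way, which is the commenter's point about needing dedicated XOR hardware at near-RAM speeds.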
A long-ass time ago, either Maxtor or Micropolis (hard drive companies which I think have both since been absorbed by Seagate) came out with a nifty concept based on SCSI: a stacking hard drive subsystem. You'd buy a base stand which connected to the PC via SCSI-I (it was just called SCSI back then, as there was no SCSI-II or later), and each drive placed on top of the stack (up to 7 of them) would, via a simple shifting mechanism, find its ID from its position in the stack and add itself to the bus. Each unit was nothing more than a plastic box with a fan, a hard drive and a connector. But the idea was perfect. I still have no idea why it never caught on. I have since designed systems which would use SAS cabling to do something similar for up to 4 SATA or SAS drives, but I lack the skills to route the 6 Gb/s signals and haven't paid anyone to do it.
The system was absolutely perfect, though. So the best thing to do here is to stop caring about the existing hard drive form factor (you can always make mounting brackets to fit things into drive bays). Instead, extend the PCIe bus over an 8x or 16x cable from the motherboard to a chassis, mount a PCIe switch on the chassis main board, and extend 4 to 16 PCIe lanes to each storage device stacked on top of it.
Then, design a new flash SSD controller which would allow each device to act as:
a) a plain old drive
b) an active normal member of a RAID, which will either write data directly or maintain an XOR of the other devices' data for the given stripe.
c) a spare member which will remain idle until it is needed.
The drives should be inserted into a cage via a backplane. And to eliminate the need for fancy cooling solutions, they should not have a case; they should be raw boards. That way the SSD RAID can either be submerged in coolant or cooled with a simple fan.
The result would be an extensible redundant device with absolutely insane speeds (each added drive would increase the performance of the RAID linearly), and it would remain extensible so long as there's more room in the PCIe switching fabric. Oh, and for RAIDing it might be necessary that the switch doesn't purely switch so much as convert PCIe unicast to PCIe multicast, so all drives can listen. A more intelligent method would define a multicast group for each of the drives and then multicast the bursts to the devices that are listening for data within the given stripe.
It's not the SSD paradigm that needs breaking. It's the disc storage paradigm. The one that places active data onto a poorly-connected interface and stores it in a funky way.
Flash is (non-volatile) memory. It should be directly addressable. Even if the storage words are block-sized. We don't need no steeking "filesystems".
NUMA is the way to handle it hardware-wise.
Software-wise: clean up your act and treat all memory as "virtually non-volatile". Flash is fast enough to "mirror" memory pages. Let the operating system pull the pages from NV to volatile RAM... which would conceptually be no more than an "L4" cache.
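That "virtually non-volatile" model already has a rough everyday analogue in memory-mapped files: ordinary loads and stores hit RAM, and the OS pages data between the backing store and memory on demand. A minimal sketch in Python (the file path is illustrative):

```python
import mmap

# A flash-backed file standing in for the "NV" tier; RAM pages
# faulted in on access play the "L4 cache" role described above.
with open("/tmp/nv_region.bin", "w+b") as f:
    f.truncate(4096)                       # one page of "NV memory"
    with mmap.mmap(f.fileno(), 4096) as region:
        region[0:5] = b"hello"             # ordinary memory writes...
        region.flush()                     # ...made durable on demand
        assert bytes(region[0:5]) == b"hello"
```

The paradigm shift the comment argues for is making this the default for *all* memory, rather than an opt-in per file, with the OS deciding when pages are mirrored back to flash.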
That's a paradigm change. Such changes aren't done by nibbling along the edge of the slice of pizza; one has to stuff the whole thing in the gob and chew like buggery to get it down.
Bernd has a valid point...
And cheesy should look at ZFS (zfsonlinux.org) to see if he can help out.... *hint*
I personally doubt that SATA interfaces will be around much in 5 years. Flash in a HDD form factor was always going to be transitional in any case.
The hard part of this particular change will be unpicking ~60 years of accumulated filesystem and hard drive wisdom (much of which is based on Newtonian physics, not electronics). Programmatically, a VFS interface layer is likely to be around forever.
Note to the author: could you please stop with the stupid mid-article panels?
I know some expensive magazines tend to think it makes them more intellectual or something, but really it is nothing more than a nuisance and a distraction.
Especially when the quote inside has absolutely no relevance whatsoever to the text that is currently being read.
If you really have to keep using those stupid mid-article boxes, at least use them as they are supposed to be used, i.e. to underline a phrase that correctly summarises the current section of the article.
For example, a more interesting sentence would have been "Connecting via the PCIe-based SCSI Express standard is consistent with Fusion’s cut-through architecture".
But really, I would prefer that they disappear. They are not standard Reg layout and not something I like seeing.