Er, DAS
Are you running some sort of redundant cluster file system or something? With a SAN or filer, if one of my ESX servers dies another can just restart the VM within a minute or two. How do you do that with DAS?
Why do we need SANs any more when many virtualised app and storage controller servers and virtualised storage can co-exist in a single set of racks? Storage area networks (SANs) came into being so that many separate physical servers could each access a central storage facility at block level. They each saw their own LUN ( …
So I will be able to use the virtual machine management software to allocate virtual storage servers. That is nice. But what if I need more than one physical storage server? With only DAS, each will have its own set of attached disks, and there is nothing the virtualization software can do to repartition the disk space between them.
So why not attach a single set of disks to all these physical storage servers using... a NAS?
So all the power of the SAN, to do snapshot copies, multipath redundancy and so on, is replicated in a NAS? I don't think so.
We moved from direct-attached storage some time ago and never looked back. We had nothing but issues with direct-attached SCSI disk arrays and the like, but the SANs we have have been fault tolerant and far speedier.
Don't get me wrong: NAS, direct-attached storage and so on have their places, but a SAN serves a specific purpose in an enterprise (even a small one like mine).
I am a little confused as to why this article has been written in the first place. We use FalconStor products to virtualise our storage so that we can provide active mirroring across multiple sites using our SAN solutions. This gives us near 100% uptime, as we are able to fail over between sites in real time. Does DAS allow you to do this?
Which is that many systems are now bottlenecked by their access to disk. For anyone with local disk this is fixed by buying solid-state storage; the problem for the folks selling overpriced hubs (Cisco, Brocade etc.) is that their "network" simply cannot come close to the performance of a couple of local SSDs on each node. Also, as users start to put SSDs into their expensive big iron, they will notice that the bottleneck is the SAN fabric and things aren't getting much faster.
This creates a real problem for those pushing traditional big-iron SANs and trying to make data centres look like Jurassic Park. If users have to use a local SSD cache to keep data access speed up to the point where modern servers can be used without choking on the SAN bottleneck, then they have to manage the data on the local SSDs too. You can't just cache the SAN data on the SSD, because some other node might need it: one of the advantages of a SAN is the ability to attach multiple nodes to one set of data (or to recover by assigning a new node in place of a failed one, which won't work if the failed node has the up-to-date data on a local SSD). This means managing cache coherency between the SAN and the client SSDs, which is not something the centralised SAN architecture was intended to do, and not something it will ever be good at doing.
The fatal issue for Fibre Channel is scalability: it can't scale to the performance of SSDs, so even if we do spend our next year's EBITDA on a single giganto-SAN, it won't scale to serve all our nodes.
There may well be a significant role for a more distributed storage system using 10GE or similar and providing local boxes full of SSD and cache memory to serve groups of server boxes over a local flat network. These smaller local data proxies would then have to use some smarter software (not overpriced hardware) to replicate back to a central store for backup and other shared services.
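The local-SSD-cache idea described above can be sketched on Linux with bcache; the device names here are illustrative, and this is only one possible implementation of the approach the poster has in mind (note that formatting destroys data on both devices):

```shell
# Pair a local SSD cache with a slower backing device (HDD or SAN LUN).
# /dev/nvme0n1 (cache) and /dev/sdb (backing) are illustrative names.
make-bcache -C /dev/nvme0n1 -B /dev/sdb

# Optionally switch from the default writethrough to writeback caching
echo writeback > /sys/block/bcache0/bcache/cache_mode

# The combined device then appears as /dev/bcache0 and is used like any disk
mkfs.ext4 /dev/bcache0
```

The coherency problem mentioned above remains, of course: a writeback cache on one node holds dirty data the SAN does not yet have.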
So wait, a 10GigE port shared between protocols and with fat IP overhead will solve the bottleneck issues of 2x8Gb FC?
Yes I've seen slow SANs - but that's a design question, and why you should have designers.
If you think you can solve any big problems with a little SSD caching or, let's call it, distributed and localised storage tiers, then this might scale to serve up all your servers; but for the ones needing performance, it will suck.
Much like virtualized servers, which do fine for 98% of the Windows boxes, but once you migrate your big apps onto them you suddenly see users ordering the largest non-virtual iron they can get.
Did I mention uplink and backbone bandwidth issues?
I think you guys need to get some reality, or you'll drift up into the clouds.
I'm building distributed storage over DDR InfiniBand, and I'm also a big fan of, e.g., Amplidata, who also have a nice take on local SSD caching. And when Pure Storage does some more selling, we will see a whole new level.
It's just that this stuff is new: unproven commodity solutions.
Yes, your performance will degrade if you bought a by-the-book SAN and consolidated 150 servers onto one EMC. But that's not an FC issue; you were asking for a bottleneck.
A SAN is a *NETWORK* and so it might be helpful to use it like one, with more storage systems, and not sending every unrelated bit of data through the core.
Just as it seemed so natural to use distributed commodity crap storage with SSD cache, why don't you consider avoiding bottlenecks in your SAN?
Bonus: the SAN is really fast when you need it. Always, not just when you have the data cached and coherency is not needed.
For the database layer, we're doing this already with Exadata: there's no option to use external storage, but you can plumb racks together by joining the InfiniBand domains.
At the server layer I'm with Chris - a Cisco UCS installation has all the I/O usage from potentially tens of servers aggregated at the Fabric interconnect. You then configure two or four or six uplinks into the storage fabric - you could put this directly into a storage subsystem and, if you don't, you're massively reducing the number of connected physical ports on your SAN.
Of course, managing all the virtualisation and abstraction layers places new requirements for organisation and discipline on the system administrators.
This is possible today using DRBD and any cluster filesystem in Linux. However, DRBD only scales to two storage nodes. An emerging solution will be to use the Ceph filesystem, or its underlying RADOS block device (RBD) layer, to scale out a large cluster of hypervisors which are also block storage servers.
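For reference, the two-node DRBD setup mentioned above boils down to a resource definition along these lines (hostnames, devices and addresses are illustrative):

```
resource r0 {
  protocol C;              # synchronous replication between the two nodes
  device    /dev/drbd0;    # replicated block device exposed to the cluster FS
  disk      /dev/sdb1;     # local backing disk on each node
  meta-disk internal;
  on node1 { address 10.0.0.1:7789; }
  on node2 { address 10.0.0.2:7789; }
}
```

The two-node limit the poster mentions is visible right in the syntax: a resource pairs exactly one `on` stanza per replica.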
A viable "enterprise" solution for something like this today is to use IBM's GPFS along with KVM or Xen in each compute/storage node. The result would appear as a compute cloud tucked inside an IBM SONAS solution.
You've missed one key part of the story. The virtual storage software, such as the LeftHand Virtual Storage Appliance, can combine storage from multiple physical servers into a single pool, allowing you to grow LUNs across a number of boxes. If you need to grow a virtual LUN, just add another server to the pool.
The author may have missed, intentionally or not, a very large segment of systems where anything besides a SAN is nonsense. Hundreds of terabytes are not just terabytes: they need rather more attention for backup, restore, high availability, performance... Numerous scenarios are completely missed. This would be a return to the age of island computing. There are systems worth tens of billions around that still do not support the latest technology developments. The article is more a joke than a serious analysis. Would anyone bet their job, house and car payments, vacations and children's scholarships on such systems?
I'm working on a project at the moment with circa 100 Windows VMs (50/50 in Prod/DR) and over 100TB of disk. We need knocking on for 10 maxed-out HP ProLiants (DL380s) per site, and we only use a fraction of the disk available from the array (an EMC VMAX). The main limitation we're finding is that the hypervisor just can't hack the IO. We see the same with RHEL as well.
That is why we need a SAN: virtualisation is great, but it still hasn't got mainframe-sized hardware behind it. Come to that, our mainframes share disk arrays.
Just curious as to why you'd have a DR VM and not use a movable machine? Or do you mean half run in your prod datacentre and half in your BCP/DR datacentre which would seem more likely?
If not, maybe I'm missing something but I'm pretty sure my DR is my Prod VM flipped to the BCP hosting (or alternative node in prod cluster) automatically. Machine is SAN stored and replicated.
Half in one DC half in the other.
Data are SRDFed to the DR site; filesystems are mounted and services started at the DR site when required, after splitting SRDF, that is. We do have capacity to vMotion machines around within each site if required.
We're working on Metro clustering to automate failover between sites, but currently it's generally preferred to run scripts manually, due to the complexity of the systems and the consequences of only part of the system failing over. Also, we need to know when failovers occur to make sure the backups run in the correct site to maintain data offsitedness.
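The manual runbook described above typically reduces to a short script per service; a minimal sketch using EMC's symcli might look like this, where the device-group name (`prod_dg`) and the volume and mount names are hypothetical:

```shell
# Illustrative DR failover sketch, not a real runbook.
symrdf -g prod_dg failover    # make the R2 (DR-side) devices read/write
vgchange -ay vg_app           # activate the LVM volume group on the DR host
mount /dev/vg_app/lv_data /data
# ...then start the application services and redirect backups to this site
```

Keeping this as an explicitly-run script, rather than an automatic trigger, is exactly the "we need to know when failovers occur" point above.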
Look for "stub IO domains".
The research folks looked into the IO performance issues some years ago; e.g. with InfiniBand in domUs it had been proven possible to get native IO rates (so multiple GB/s).
Unfortunately no one bothered to do any implementations in the real world, so this is well researched, solved, and yet a failure.
On ESXi it takes secret sauce to tune a VM to 300-400MB/s IO rates; beyond that you can forget it.
On KVM?
hahahaahahahahahahaha
FC and iSCSI both depend on virtualizing storage over networking on a separate software layer.
More and more the trend is to eliminate the network layer and go to straight SAS.
8 port, 2 connector SAS 6Gb HBAs are now around $400.
6Gb SAS switches now exist.
12Gb coming soon.
a) Ignore blades. Blades are not the answer. The cooling required to host blade servers isn't available at the data centres most of the market can afford, and the compromises you need to make in terms of single thread performance make them poorly suited for many tasks.
b) We have disks. We have servers. We need to be able to move hosts between physical nodes in the event of failure, or for maintenance. This means the same storage has to be available to all physical nodes. This means we need.... oh wait, we need a SAN.
All this fancy stuff you're trying to push is either a way of rebranding DAS, which is useless if you want the ability to migrate hosts across nodes in any sort of reasonable timeframe, or a way of rebranding a SAN.
You can have Fibre Channel, iSCSI, FCoE, all in different speeds, and all able to fulfil the IO requirements of pretty much any single node (iSCSI admittedly suffers from somewhat higher latency and overheads than the other two). If you run into an interface bottleneck, just upgrade it or bond several together. 40Gbps Ethernet arrived recently; if you need more than 40Gbps out of a single storage array, you're doing something badly wrong and need to re-architect your environment.
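The "bond several together" option mentioned above is a standard Linux configuration; a minimal sketch in Debian-style ifupdown syntax, with illustrative interface names and addresses:

```
# /etc/network/interfaces fragment: aggregate two ports into bond0.
# 802.3ad (LACP) mode requires matching configuration on the switch.
auto bond0
iface bond0 inet static
    address 10.0.0.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode 802.3ad
    bond-miimon 100        # link-monitoring interval in milliseconds
```

Note that LACP balances per-flow, so a single iSCSI session still rides one physical link; for per-session scaling, iSCSI multipathing (MPIO) is the usual answer.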
SANs are not evil, they are not scary, they don't cost the earth. We aren't desperate to get rid of them. For the most part they just work, they are incredibly flexible, and they are very easy to scale in any direction. These products are marketing-driven attempts to sell a solution that doesn't solve any problems.
Oh and by the way - in this world things like SAS interfaces aren't even close to fast enough to provide interconnects between hosts and arrays, or to connect arrays together. They are fine for a few disks in an array, but for more than that they are useless. The interfaces we have right now work fine, thanks.
Expect to see DAS environments comprising many servers, thousands of drives, multiple storage personalities, all managed as a single 'blob' of storage and server compute power delivered at commodity pricing. A mainframe made from cheap bits, you could think of it as. If you can have thousands of drives in a DAS environment, and all the resilience and functionality of an enterprise storage array, but at commodity pricing, then why the hell would you need a SAN? And there is only one company that has the secret sauce to do this!! Bit of a clue there!
The DAS is shared across multiple servers. You're all still stuck in the mindset of a few internal drives in a tower! Think thousands of drives, where any server can see any drive. All the benefits of a SAN but without the cost and complexity of a SAN! 2000 drives behind 16 storage controllers (servers), each with a personality, or even multiple personalities, and the ability to provision, reprovision, change personalities, etc. Every server can access every drive. No little islands of trapped DAS. Boom!
Behold the power of a FC loop setup
Which was found to be really stupid somewhere around 2000.
Oh yes, you can do Lun masking in SAS
Which is another thing that has been basic FC functionality but proved to be too tedious for real usage.
So you could kinda add zoning now
And VSANs
And SNIA apis
Then it would be easy to manage. And if you also add better routing, redundancy and QoS, then you'd end up with FC. Ah, no. I forgot NPIV.
FC is not complicated for storage admins, it's easy and reliable.
And you're trying really hard to believe you're not just creating, then solving the same problems another time.
How will scalability work for you if you move beyond a smallish setup?
160 storage controllers? 320?
How much metadata do they have to send around?
I'm really open to new stuff, but it should not be old bullshit.
If you have a hardware-accelerated NAS like, um, the ones BlueArc make, then you really don't need to muck about with all this. Have one centralised storage infrastructure that can serve your CIFS and NFS clients as well as providing iSCSI if you need it. You can also use the accelerated NFS to run your VMware without all the complexity of a SAN and the bottlenecks of a NetApp filer.
And we all know where that took us. Massively over-inflated costs, and dependency on a single vendor who could hold us hostage for years to come, with very specialised technical knowledge needed once the thing started falling apart. Have you ever compared the price of a disk inside such a vendor monster with the cost of buying it separately? That is the crux: it is not a SAN, it is not DAS, it is not a file server; it is a cobbled-together land grab by vendors to try to drag us back to the middle ages of computing.
By the way, isn't this what Oracle Exadata is supposed to deliver? If not, some of their sales guys are telling fibs!