SGI plunks Windows on big Altix UV supers

High-end cluster and shared-memory supercomputer maker Silicon Graphics radically and quietly expanded its addressable market on Wednesday, announcing that it has certified Microsoft's Windows Server 2008 R2 operating system on its line of Altix UV systems. The high-end Altix UV 1000 systems are based on Intel's Xeon 7500 …

COMMENTS

This topic is closed for new posts.
  1. Paul Crawford Silver badge
    Thumb Down

    Sad but predictable

    Using a massive cluster "to parallelize and execute Excel workbook macros" makes sense to those who don't consider using proper maths/statistics tools for the job.

    Yes, you may be able to re-use 'business skills', but consider the problems of validating complex spreadsheets: poor data typing, the lack of proper program structure, no version/change control, debugging issues, etc.

    My heart sinks at the idea of it.

    1. Lior Paster

      It's not a cluster...

      It runs a single OS image on all 256 processors (2048 cores) and the whole memory block. It's like a HUGE desktop... It doesn't need an OS instance for each server like a cluster would.

      Single OS makes life VERY easy for administration and management. One memory block and all CPUs under one OS is a HUGE thing for some applications that DON'T scale on clusters!

      It's a standard x86 box running standard Linux. That's a huge thing! Who wants to deal with Power or Itanium or Unix??? Everyone now wants to use a standard OS and off-the-shelf software and IO, and SGI can run STANDARD x86 64-bit SUSE and RHEL and Windows on UV. That's the way to go, because these are the simple, standard operating systems of the future.
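
      (For illustration - a minimal sketch in C, assuming nothing more than a stock Linux toolchain and POSIX threads, not anything SGI-specific: an ordinary shared-memory program simply asks the single OS image how many cores are online and uses them all, with no MPI or cluster middleware. The same binary runs on a laptop or on a big Altix UV; only the reported core count changes.)

      /* Minimal sketch, assuming a stock Linux/glibc toolchain (compile with
       * gcc -pthread). A plain POSIX-threads program asks the single OS image
       * how many cores are online and uses them all -- no MPI, no cluster
       * middleware; only the reported core count differs between machines. */
      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>

      static void *work(void *arg)
      {
          /* real code would operate on the shared memory here */
          return arg;
      }

      int main(void)
      {
          long ncpu = sysconf(_SC_NPROCESSORS_ONLN); /* cores visible to this one OS image */
          if (ncpu < 1)
              ncpu = 1;

          pthread_t *t = malloc(sizeof(pthread_t) * (size_t)ncpu);
          if (t == NULL)
              return 1;

          for (long i = 0; i < ncpu; i++)
              pthread_create(&t[i], NULL, work, NULL);
          for (long i = 0; i < ncpu; i++)
              pthread_join(t[i], NULL);

          printf("ran %ld threads under a single OS image\n", ncpu);
          free(t);
          return 0;
      }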

  2. Anonymous Coward
    Stop

    8 Socket Xeons Max Memory - fact check...

    >> With eight-socket Xeon 7500 machines petering out at somewhere around 512 GB to 1TB of main memory

    Hmmm - the HP ProLiant DL980 already goes up to 2TB, and I think higher-density DIMMs are only a few months away...

  3. Kebabbert

    Just a cluster

    This server is just a cluster. IBM, Sun and HP have long made huge servers, with as many as 64 CPUs. There is no chance that a Linux machine has 256 CPUs. It is just a cluster, and behaves like a cluster. See here for more information:

    http://www.c0t0d0s0.org/archives/6751-Thank-you,-SGI.html

    1. Roger Heathcote 1
      Headmaster

      Bollocks

      You are talking out of your arse, Kebabbert.

      It was SGI who had to persuade Linus and the LKML to increase the value of MAXSMP (the maximum number of cores) in the Linux kernel to 4096, precisely so they could make machines with 256 processors. SGI make machines that can run a huge number of cores in a single system image - I have seen it with my own eyes. Owing to the changes above, they can even run off-the-shelf Linuxes with stock kernels and a single massive block of memory, so you're talking utter rubbish.

      That there exist some workloads in which this architecture performs no better than a clustered solution is neither here nor there. Clustering is a fundamentally different paradigm to this - you try initializing an array of 1000 x 1000 x 1000 bytes on your cluster and see how far you get!
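
      (To make the array example concrete - a minimal sketch in C, assuming only a standard compiler and enough RAM: under a single system image the whole array is one contiguous allocation in one address space, whereas on a cluster no single OS instance owns all the memory, so the same array would have to be partitioned across nodes, for example with MPI.)

      /* Minimal sketch, assuming a standard C toolchain: allocate and touch a
       * contiguous 1000 x 1000 x 1000 byte array (~1 GB) in a single address
       * space. On a shared-memory system this is one malloc(); on a cluster no
       * single OS instance owns all the memory, so the array must be split up. */
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      int main(void)
      {
          size_t n = 1000UL * 1000UL * 1000UL;   /* the array from the comment above */
          unsigned char *a = malloc(n);
          if (a == NULL) {
              perror("malloc");
              return 1;
          }
          memset(a, 0, n);   /* touch every page so it is really backed by RAM */
          printf("initialized %zu bytes in one address space\n", n);
          free(a);
          return 0;
      }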

      What we are talking about here is supercomputing and you, sir, are a numpty who needs to shut up more often in public.

      1. Kebabbert

        @Roger Heathcote 1

        Before you reveal your ignorance, maybe you should think about what you say? Big Unix servers such as huge Solaris servers, AIX servers, HP-UX servers, etc. have scaled well for decades, and have at most had something like 100 CPUs. Today they have 64 CPUs at most. Do you really think that a toy OS such as Linux comes from nowhere and suddenly scales to 1000s of CPUs in no time?

        It takes decades to scale well. Why are there no Unix servers with 1000s of CPUs today? Why does IBM not offer such servers? Answer: IBM can't. They cannot get their mature, stable enterprise Unix to scale that much, after decades of work. IBM's largest server offers 32 CPUs today. Their biggest mainframe offers 24 CPUs. The biggest Solaris servers today have 64 CPUs.

        It is ridiculous to believe that an unstable toy OS with low code quality scales from 2-4 CPUs to 1000s of CPUs in a couple of years. It just does not add up, if you think a bit. The answer is that SGI's machine is just a huge cluster. Do you need proof? Read that link. The SGI machine has very impressive benchmarks, but they are just clustered benchmarks.

        I suggest you start to think a bit before spewing out nonsense. I have a golden bridge I can sell you - do you want it? Do you really believe everything you hear, without critical thinking?

        1. Anonymous Coward
          Anonymous Coward

          "unstable toy OS with low code quality"

          Quite a number of people here talk about things they don't understand, but you seem to have reached degree standard.

        2. Chemist
          Flame

          Suggest you ..

          read what SGI say about it.

          http://www.sgi.com/products/servers/altix/uv/

          " whereas Altix UV 100 and Altix UV 1000 are all about scaling a single system image to a maximum of 2048 cores"

          "Because of SGI's collaboration with the Linux community, Altix UV runs completely unmodified Linux, including standard distributions from both Novell and Red Hat"

        3. Chemist

          Oh and by the way

          Your comment "unstable toy OS with low code quality" REALLY shows up your lack of knowledge

          That's why Linux is used for >90% of the world's top supercomputers?

          1. Kebabbert

            @Chemist

            Jesus, not this one again. First of all, yes, Linux has low code quality. If you missed that, you are really revealing your ignorance. Maybe you should read up on Linux instead?

            You want proof of the low code quality? Here, straight from the Linux kernel developers themselves.

            .

            Linus Torvalds

            http://www.theregister.co.uk/2009/09/22/linus_torvalds_linux_bloated_huge/

            "The kernel is huge and bloated, and our icache footprint is scary. I mean, there is no question about that. And whenever we add a new feature, it only gets worse."

            .

            Andrew Morton says:

            http://lwn.net/Articles/285088/

            "I used to think [code quality] was in decline, and I think that I might think that it still is. I see so many regressions which we never fix....it would help if people's patches were less buggy."

            .

            http://kerneltrap.org/Linux/Active_Merge_Windows

            "The [linux source code] tree breaks every day, and it's becomming an extremely non-fun environment to work in....We need to slow down the merging, we need to review things more, we need people to test their f--king changes!"

            .

            Other developers also agree that the Linux code is bad.

            http://www.forbes.com/2005/06/16/linux-bsd-unix-cz_dl_0616theo.html

            "[Linux] is terrible," De Raadt says. "Everyone is using it, and they don't realize how bad it is. And the Linux people will just stick with it and add to it rather than stepping back and saying, 'This is garbage and we should fix it.'"

            IT company CEO: "You know what I found? Right in the [Linux] kernel, in the heart of the operating system, I found a developer's comment that said, 'Does this belong here?'" Lok says. "What kind of confidence does that inspire? Right then I knew it was time to switch."

            .

            I have more links from Linux kernel developers. One says "The kernel is going to pieces". Do you want to read those links? Just tell me and I'll post more links for you.

            The conclusion is: yes, Linux code quality is poor. Even Linus and other Linux kernel developers say that. I am just repeating what they say. If you want to attack me, don't. Attack Linus, Andrew, etc. instead - I just quote them. Don't shoot the messenger.

            .

            .

            .

            Regarding the Top 500 supercomputers running Linux - Jesus, not that one as well. Those supercomputers are just huge clusters on a fast network. Just add a new node and you have improved performance. Anyone can do that. Just like this SGI machine, which runs tasks that are embarrassingly parallel. Have you ever heard of P-complete problems? Or NC-complete? No? Study some parallel algorithm complexity theory, then. And then come back.

            Regarding the Top 500 list, it says nothing. In 6th(?) place we find the IBM Blue Gene. It is one of the fastest supercomputers in the world. It uses 750 MHz PowerPC CPUs. Does that mean that PowerPC CPUs are among the fastest in the world? No, it does not. The Top 500 machines run Linux because Linux is easy to strip and modify. Google runs a modified Linux kernel; I have a link about that. It is not here at work, but when I get home I can post it for you.

            Linux is easy to modify. It is a naive kernel. It is not scalable. There is no way in hell it scales to 1000s of cores. Linux developers have never had access to servers with 8 CPUs or more. Linux is not tailored to that many CPUs and it scales badly. But for clustered solutions (just a network), Linux is fine.

            Here we have a bunch of Linux scaling experts who "dispel the FUD from Unix vendors that Linux does not scale well" in an article:

            http://searchenterpriselinux.techtarget.com/news/929755/Experts-Forget-the-FUD-about-Linux-scalability

            "Linux has not lagged behind in scalability, [but] some vendors do not want the world to think about Linux as scalable. The fact that Google runs 10,000 Intel processors as a single image is a testament to [Linux's] horizontal scaling.

            Today, Linux kernel 2.4 scales to about four CPUs

            "-With the 2.6 kernel, the vertical scaling will improve to 16-way. However, the true Linux value is horizontal scaling [that is: clusters].

            Q: Two years from now, where will Linux be, scalability-wise, in comparison to Windows and Unix?

            A: It will be at least comparable in most areas"

            .

            According to the Linux experts, Linux scales to 10,000 CPUs in one single image in the current v2.4, and in Linux 2.6 the kernel will improve to 16-way? Isn't that a bit strange? It doesn't add up, does it? Do you Linux fanboys ever think a bit?

            The Altix machine sold in 2006 with 4096 cores was using Linux v2.6 (which had only been released two years earlier). I find it extremely hard to believe that in v2.4 Linux scaled badly (2-4 CPUs) and two years later it suddenly scaled to 4096 cores in the Altix machine. It takes decades to scale well. The only conclusion is that the Altix machine is a cluster; otherwise Linux would have had no chance to scale to 4096 cores in two years.

            Linux scales well on large clusters, yes. But that is not Big Iron. When people say Linux scales well (which it does), they are talking about clusters - that is, scaling horizontally.

            In other words: Linux scales well HORIZONTALLY, but is still not good at VERTICAL scaling (that is, those huge Unix servers with 64 CPUs weighing a ton).

            1. Chemist

              So ..

              you are saying that SGI don't know what they are manufacturing and don't care that customers will notice that it's all snake oil. Yeah, right.

              The fact, by the way, that you can read (any) comments in the kernel source is a strength. Who knows what nightmare is in the Windows source.

              From an Intel paper software.intel.com/sites/oss/pdfs/mclinux.pdf

              2.6 Linux kernels (which have better SMP scalability compared to 2.4 kernels)

              From an MIT paper pdos.csail.mit.edu/papers/linux:osdi10.pdf

              "First we measure scalability of the MOSBENCH

              applications on a recent Linux kernel (2.6.35-rc5, released

              July 12, 2010) with 48 cores...."

              "FreeBSD, Linux, and Solaris [54], and find that Linux

              scales better on some microbenchmarks and Solaris scales

              better on others. We ran some of the MOSBENCH appli-

              cations on Solaris 10 on the 48-core machine used for

              this paper. While the Solaris license prohibits us from re-

              porting quantitative results, we observed similar or worse

              scaling behavior compared to Linux; however, we don’t

              know the causes or whether Solaris would perform better

              on SPARC hardware. We hope, however, that this paper

              helps others who might analyze Solaris."

              From The Register http://www.theregister.co.uk/2010/11/10/redhat_rhel_6_launch/

              On 64-bit x64 platforms, it can scale to 128 cores/threads and 2TB of main memory using one set of kernel extensions and to 4,096 cores/threads and 64TB of main memory. (Not that anyone can build a system with that many processors or that much memory yet.)

              These are NOT clusters - I used a 1024-node cluster YEARS ago - these are single-image machines. I notice you've had the same arguments with different people all over the net. They all told you you were wrong.

              1. Kebabbert

                @Chemist

                "...I notice you've had the same arguments with different people all over the net. They all told you you were wrong..."

                So why do you duck when I talk about the low code quality of Linux? You said I was wrong, and I posted only some of the links I have collected, which show that Linux has problems with code quality. Do you want to see more links where Linux kernel devs say the code is bad?

                So I was not lying nor FUDing when I said Linux had bad code. Even the kernel devs say so. I am right on this, even though people say I am wrong (including you). It is, in fact, they who are wrong. So, no, I am not wrong. I know what I am talking about. The people saying I am wrong know nothing, nor have they read the mailing lists.

                .

                Regarding Linux scaling, yes, it sucks badly. When we talk about scaling, we must distinguish between two kinds: horizontal scaling and vertical scaling. Vertical scaling is just one huge computer with many CPUs. IBM's biggest vertical server has as many as 32 CPUs; it is called the Power 795. There is also horizontal scaling, which is basically a cluster with many nodes on a fast network. IBM also has such machines; for instance, all their supercomputers belong to this type and have 1000s of CPUs.

                Why do you think IBM has no vertical-scaling server with 1000s of CPUs? Answer: no one has! It is very difficult to scale well vertically. Horizontal scaling is easy: just add another node.

                SGI has no vertical-scaling servers. The biggest vertical-scaling servers on the market today have as many as 64 CPUs (Solaris). Even IBM's greatest mainframes have 24 CPUs. SGI's server with 1000s of cores is just a cluster. Just look at the benchmarks of the SGI machine! All those benchmarks are embarrassingly parallel cluster benchmarks. You have no idea what you are talking about.

                .

                Linux scales very well horizontally, on large supercomputers. For instance, Google uses 10,000 CPUs for their Linux cluster. THERE ARE NO SERVERS WITH 10,000 CPUs TO BUY ON THE MARKET. Period. Google runs a cluster.

                Linux scales very badly vertically, on one single huge server with many CPUs - as many as 32 or 64. Until recently, Linux had the Big Kernel Lock: one core wants to do something, and all the other cores have to wait. There is no way in hell you can get scalability with the BKL. You need much more fine-grained locking.
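
                (As an illustration of the coarse-lock point - a minimal sketch in C, emphatically not kernel code: every thread has to take the same global mutex, much as the old Big Kernel Lock serialised work in the kernel, so adding cores mostly adds waiting rather than throughput.)

                /* Minimal sketch (illustration only, not kernel code): a single
                 * global mutex serialises all threads, so running this on more
                 * cores barely improves throughput -- the coarse-lock problem the
                 * Big Kernel Lock posed for vertical scaling. Build: gcc -pthread */
                #include <pthread.h>
                #include <stdio.h>

                static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
                static long counter;

                static void *work(void *arg)
                {
                    (void)arg;
                    for (int i = 0; i < 1000000; i++) {
                        pthread_mutex_lock(&big_lock);   /* every core queues up here */
                        counter++;
                        pthread_mutex_unlock(&big_lock);
                    }
                    return NULL;
                }

                int main(void)
                {
                    enum { NTHREADS = 8 };
                    pthread_t t[NTHREADS];

                    for (int i = 0; i < NTHREADS; i++)
                        pthread_create(&t[i], NULL, work, NULL);
                    for (int i = 0; i < NTHREADS; i++)
                        pthread_join(t[i], NULL);

                    printf("counter = %ld (all updates were serialised by one lock)\n", counter);
                    return 0;
                }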

                Here is proof that I am correct and you are wrong. Linux developer Ts'o, who develops ext4, says that until recently Linux kernel devs had no access to machines with as many as 48 cores. See? The computers with 1000s of CPUs are just clusters. SGI's is one of them.

                http://thunk.org/tytso/blog/2010/11/01/i-have-the-money-shot-for-my-lca-presentation/

                "...and for a long time, 48 cores/CPU’s and large RAID arrays were in the category of “exotic, expensive hardware”, and indeed, for much of the ext2/3 development time, most of the ext2/3 developers didn’t even have access to such hardware. One of the main reasons why I am working on scalability to 32-64 nodes is because such 32 cores/socket will become available Real Soon Now"

                He confesses that until recently Linux kernel devs had servers with as few as 48 cores. That is chicken shit. How can Linux scale vertically when the devs have no access to huge servers with 32 or 64 CPUs? Impossible.

                .

                Regarding Solaris: for decades Solaris has scaled to 100s of CPUs vertically, on huge servers. Today it scales to 512 threads. In a few years, Oracle will release a machine on which Solaris scales to 16,384 threads. The Solaris OS treats each thread as a CPU. Everything is a CPU. The scheduler sees everything as CPUs and makes no distinction. So the 512-thread servers sold today are 512 CPUs, and on a 16,384-thread server Solaris sees a huge machine with 16,384 CPUs.

                Your benchmark with as few as 48 cores doesn't prove much. Go up to 100s of cores and Solaris will crush Linux easily. Official SAP benchmarks show that Solaris scales better than Linux on 48 cores. The Solaris server uses slower RAM and CPUs, and still wins over Linux on the 48-core SAP benchmark. But 48 cores is on the edge for Linux. Go up to 100s of CPUs and Solaris wins easily.

                1. Chemist

                  FYI

                  http://www.psc.edu/publicinfo/news/2010/101110_Blacklight.php

                  has 4096 cores running 2 Linux kernel images

                  1. Kebabbert

                    @Chemist

                    So what? As I said, the Linux enterprise scaling experts debunk the FUD from Unix vendors in this article:

                    http://searchenterpriselinux.techtarget.com/news/929755/Experts-Forget-the-FUD-about-Linux-scalability

                    "Linux has not lagged behind in scalability, [but] some vendors do not want the world to think about Linux as scalable. The fact that Google runs 10,000 Intel processors as a single image is a testament to [Linux's] horizontal scaling."

                    They say that Google runs 10,000 Intel CPUs as A SINGLE IMAGE. But we all know that Google uses a cluster.

                    Let me repeat again: there are no servers with 1000s of CPUs for sale today. Not even the big, mature enterprise Unix vendors such as IBM, HP and Sun have/had them. Machines with 1000s of CPUs are clusters.

                    In fact, the biggest Unix server, the IBM Power 795, just recently released with as many as 32 CPUs, needed the mature enterprise IBM AIX to be rewritten recently to be able to scale to as many as 32 CPUs. AIX could not handle that scalability - and AIX is old and mature and has been run on huge servers with many CPUs for decades. If AIX had problems scaling to 32 CPUs, something is wrong if Linux is claimed to scale to 1000s of CPUs. At the same time, the Linux scaling experts say in that very article I linked to:

                    "Yes, Linux scales excellent. Google uses 10.000 cpus as a single image. Today, in v2.4 Linux scales to 2-4 cores. In v2.6 Linux will even scale to as many as 32 cores!!! Yes Linux scales excellent!!! But Linux true strength is HORIZONTAL scaling"

                    Horizontal scaling means clusters.

                    I suggest you think about this for a while. Something does not add up. In that article, Linux scales superbly: up to 2-4 cores in v2.4, and to 10,000s of cores. They are talking about VERTICAL scaling vs HORIZONTAL scaling.

                    No one denies Linux scales excellently horizontally (clusters) - it does scale well. Just look at the supercomputers and the Top 500 list. They are all huge clusters and very successful.

                    When people say that Linux scales badly, they are talking about vertical scaling (one huge server with as many as 32 or even 64 CPUs) - in that case, Linux scales to 8 CPUs or fewer.

                    1. Chemist

                      You seem to be under the wrong impression..

                      A multiprocessor system that runs ONE copy of the OS is not a cluster - how can it be?

                      You seem to think that a multiprocessor NUMA system is a cluster - it's not

                      I repeat: the PSC Blacklight is an Altix® UV1000 with 4096 cores and 2 copies of Linux.

                      http://www.psc.edu/publicinfo/news/2010/101110_Blacklight.php

                      A computer cluster is a collection of computers that are highly interconnected via a high-speed network or switching fabric. Each computer runs under a separate instance of an Operating System (OS).

                      A multiprocessing computer is a computer, operating under a single OS and using more than one CPU, wherein the application-level software is indifferent to the number of processors. The processors share tasks using Symmetric multiprocessing (SMP) and Non-Uniform Memory Access (NUMA).
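
                      (A minimal sketch in C of the practical difference, assuming an ordinary Linux system with sysfs mounted: a single OS image exposes every online CPU and every NUMA node it manages under /sys/devices/system/node, whereas each node of a cluster only ever reports its own hardware.)

                      /* Minimal sketch, assuming Linux with sysfs mounted: count the
                       * online CPUs and NUMA nodes visible to this one OS image. On a
                       * single-system-image NUMA machine the totals cover the whole box;
                       * on a cluster each node's OS reports only its own hardware. */
                      #include <dirent.h>
                      #include <stdio.h>
                      #include <string.h>
                      #include <unistd.h>

                      int main(void)
                      {
                          long cpus = sysconf(_SC_NPROCESSORS_ONLN);
                          int nodes = 0;

                          DIR *d = opendir("/sys/devices/system/node");
                          if (d != NULL) {
                              struct dirent *e;
                              while ((e = readdir(d)) != NULL)
                                  if (strncmp(e->d_name, "node", 4) == 0 &&
                                      e->d_name[4] >= '0' && e->d_name[4] <= '9')
                                      nodes++;               /* one directory per NUMA node */
                              closedir(d);
                          }

                          printf("this OS image sees %ld online CPUs across %d NUMA node(s)\n",
                                 cpus, nodes);
                          return 0;
                      }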

                      1. Kebabbert

                        @Chemist

                        I suggest you read more about vertical scaling vs horizontal scaling. Then you can come back. The biggest, most mature enterprise Unix servers have 32-64 CPUs today, after many decades of research and development.

                        And suddenly Linux comes from nowhere with low-quality code and in a few years scales to 1000s of CPUs, way better than the old, mature enterprise Unix? That SGI Altix cluster with 1000s of CPUs was sold back in the days when Linux still had the Big Kernel Lock. You cannot get scaling with the BKL. Impossible. But if you use Linux as a cluster, the BKL does not matter. Again: look at the benchmarks for the SGI Altix machine and convince yourself they are all clustered benchmarks.

                        And people say that I am wrong on this? Am I also wrong about Linux having bad code? Again, I suggest you study some more and then come back. Until then: no, I am not wrong. What you Linux fanboys say just does not add up. Think about it. If you don't see the glaring holes in your Linux advertising, I suggest you think harder. Linux has bad code quality, and it scales badly. Just read what Linux developer Ted Ts'o writes: until recently, Linux kernel devs had no access to machines with as many as 48 cores (which is chicken shit in Unix environments). And still Linux scales to 1000s of CPUs? It does not add up! Something is wrong in your reasoning. Think a bit.

                        1. Chemist

                          Oh good grief!

                          I repeat: the PSC Blacklight is an Altix® UV1000 with 4096 cores and 2 copies of Linux.

                          http://www.psc.edu/publicinfo/news/2010/101110_Blacklight.php

                          1. Kebabbert

                            @Chemist

                            Good grief on yourself. Your post reminds me of this earlier post of yours:

                            "...Your comment "unstable toy OS with low code quality" REALLY shows up your lack of knowledge..."

                            And as I proved with links to Linux kernel developers, Linux has low code quality. They say it themselves. I am just quoting them; this is not something I made up. I have collected more such links that I did not post here.

                            When you say I lack knowledge about Linux code quality, it only shows your own lack of knowledge. I suspect your earlier posts are ignorant as well. Linux scales well when it has the BKL? You have got to be kidding. I've tried to explain again and again, but you refuse to understand, or are unable to. Hey, listen, Chemist: go and study some computer science and computer architecture and then come back. Your knowledge(?) of chemistry does not help you in this discussion.

                            And yes, I will continue to say that Linux scales badly, because it does. I AM right on this. Just look at SGI's benchmarks! They are all clustered! Jesus.

                            1. Chemist

                              On the other hand ..

                              You, supported by no-one, have ranted on without any evidence and totally ignored any real references to actual hardware and systems. I think you should attempt to get yourself up-to-date with the whole area.

                              Blacklight, the World’s Largest Coherent Shared-Memory Computing System, is Up and Running at the Pittsburgh Supercomputing Center

                              PITTSBURGH, PA., October 11, 2010 — Researchers are making productive use of Blacklight. This new system, which the Pittsburgh Supercomputing Center (PSC) acquired in July (aided by a $2.8M award from the National Science Foundation) features SGI’s (NASDAQ:SGI) newest scalable, shared-memory computing platform and associated disks. Called Blacklight, the SGI® Altix® UV1000 system’s extremely large, coherent shared-memory opens new computational capability for U.S. scientists and engineers.

                              Featuring 512 eight-core Intel Xeon 7500 (Nehalem) processors (4,096 cores) with 32 terabytes of memory, Blacklight is partitioned into two connected 16-terabyte coherent shared-memory systems — creating the two largest coherent shared-memory systems in the world.

                              These days, the Top500 list of the world's most powerful supercomputers is dominated by cluster designs assembled from many independent computing nodes. But there's still a place in the world for an earlier approach, as evidenced by a new machine called Blacklight at the Pittsburgh Supercomputing Center.

                              http://insidehpc.com/2010/08/03/the-rich-report-the-16-terabyte-pc-sgi-bets-on-exascale/

                              insideHPC: And that's a single system image for all those cores?

                              Dr. Eng Lim Goh: Yes. It runs as a Single System Image on the Linux operating system, either SuSe or Red Hat, and we are in the process of testing Windows on it right now. So when you get Windows running on it, it’s really going to be a very big PC. It will look just like a PC. We have engineers that are compiling code on their laptops and the binary just works on this system. The difference is that their laptops have two Gigabytes of memory and the Altix UV has up to 16 Terabytes of memory and 2000+ physical cores.

                              So this is going to be a really big PC. Imagine trying to load a 1.5 Terabyte Excel spreadsheet and then working with it all in memory. That’s one way of using the Altix UV.

                              (I may be a chemist, but my involvement in computing goes back almost 30 years, from the SC/MP, 6502, 6809, 68000 and PICs, through the PDP-11/34, Evans and Sutherland vector graphics systems and VAX/VMS, all the way to SMP workstations and 1024-CPU Linux clusters.)

  4. Piri Piri Chicken
    Thumb Up

    How much spam can that beastie pump out when it gets pwned?

    And how long is it going to take to reboot when the security bods decide it's a good idea to set the "clear pagefile on shutdown" instruction?

    A 36GB node can take over 30 minutes from the issue of a shutdown command to the time it comes back and is usable again.

    Still nice and awesome though.

  5. Long John Brass
    Grenade

    SGI SPUNKS WINDOWS ON BIG ALTIX UV SUPERS

    There, fixed the title for you.

  6. TeeCee Gold badge
    Coat

    Windows on an Altix supercomputer?

    Someone's gotta say it:

    How many simultaneous instances of Crysis can it run?

  7. InITForTheMoney

    VMware / IBM Bricks rival?

    If this thing were certified for VMware and pitched at the right price, it could be a serious rival to IBM's "Bricks" architecture of interconnecting 4 x86 nodes to form a single system image and then carving that image up into virtual machines, especially as it would allow you to consolidate even large scale-up workloads. Imagine being able to virtualize your entire data centre onto one of these systems occupying just 4 racks, with all networking provided virtually inside the single system and data transferred effectively as fast as the NUMA interconnect: you could greatly simplify an IT infrastructure and reduce networking assets and the number of individual management points. Provided you can get enough bandwidth to some suitable storage, this system could be an absolute killer.

This topic is closed for new posts.
