The Sparc T4 processor that Oracle expects to ship before the end of December for its entry-level and mid-range server platforms is probably the most important chip that either Sun or Oracle has put into the field since the dual-core UltraSparc-IV+ "Panther" arrived in October 2004. A lot of business is at stake – and so is the …
My favourite java image
That roadmap is my favorite "all you need to know about java" image.
You increase raw CPU performance by 32x, the number of cores by 4x and memory capacity by 10x, and what do you get for all your efforts? A 10x Java performance increase.
So.... are you bemoaning....
1) The existence of languages that are not glorified macroprocessors?
2) The fact that other languages that are not glorified macroprocessors are less mainstream?
3) The von Neumann architecture?
4) The fact that in this universe, X-way processing does not guarantee a speedup of X?
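Point 4 is essentially Amdahl's law. A minimal sketch, assuming a hypothetical workload in which a fixed fraction of the work can be spread across processors; the 0.9 below is an illustrative value, not a figure taken from the roadmap:

```python
def amdahl_speedup(parallel_fraction: float, ways: int) -> float:
    """Ideal speedup on `ways` processors when only `parallel_fraction`
    of the work can be spread across them (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if ways < 1:
        raise ValueError("ways must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / ways)


if __name__ == "__main__":
    # 0.9 is an assumed, illustrative parallel fraction.
    for ways in (1, 4, 32, 1024):
        print(f"{ways:>5}-way: {amdahl_speedup(0.9, ways):.2f}x")
```

With 90% of the work parallelisable, the speedup stays below 10x however many ways are added, because the serial 10% never shrinks.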
T3 and L3 cache, crypto unit algorithm support
"The Sparc T3 chip, by contrast, did not have an L3 cache at all, either on or off the chip, and this no doubt affected its performance detrimentally."
So, you're saying that 4MB of L3 (shared, plus 128kB*8 of L2, not shared) must necessarily beat 6MB of L2 cache (all shared)? Could you perhaps explain your thinking here?
Oh, and a nitpick: Only the first four algorithms mentioned in relation to the crypto unit are in fact ciphers. CRC32c is a checksum and not "encryption". The next four are cryptographic hashes and the last three are specialised instructions to speed up parts of certain algorithms, but are not themselves "encryption" either. Of course, this horribly buggers up the flow of a cheesy write-up, but as a tech reporter I do think you can be expected to understand the difference and come up with cheesy write-ups that retain correctness.
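To illustrate the distinction: a checksum such as CRC32c is keyless and only meant to catch accidental corruption, while a cryptographic hash is keyless and designed to be one-way. Neither can be "decrypted", which is what separates both from a cipher. A minimal sketch using Python's standard library; the bitwise `crc32c` helper is written here purely for illustration (Python's `zlib.crc32` uses a different polynomial), and no cipher is shown because the standard library does not ship one:

```python
import hashlib

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC32c: slow, but shows it is just a 32-bit checksum."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def sha256_hex(data: bytes) -> str:
    """Cryptographic hash: a 256-bit digest, no key, no way back to the input."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    message = b"123456789"
    print(f"CRC32c : {crc32c(message):08x}")
    print(f"SHA-256: {sha256_hex(message)}")
```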
Explaining just why that crypto unit is a big deal for databases would be useful for those of us who aren't big enterprise datamongers. I can guess, but can you explain?
Not just databases
KASUMI is a telecoms cipher.
"the impeding Solaris 11"? Is that what you mean, or, is it perhaps a Freudian typo for "the impending Solaris 11".
(Reminds me of the introduction to a VMS manual that read "Between you and the Vax is VMS" :-) )
Looking nice Oracle..
Let's see what they actually deliver, but it all looks promising for all the Solaris shops out there. I know my next contract (an absolutely massive Solaris shop) has T4s on the ground and is evaluating them, so it'll be nice to get my hands on them.
Awaiting the usual suspects...
If they actually wanted to win back the science market they should consider building a workstation based on it... But I guess the days of the Sun workstation are permanently over...
Oracle, of all companies, isn't really going to care about the scientific community: they don't have much spending power, and when they do spend lots to buy HPC setups, HPC builders such as IBM/Cray/SGI don't make much money at all, hence Oracle not competing in that market.
If it doesn't make them money they won't do it, simples. One reason why they have lots of it.
The scientific community
are most likely to use Linux workstations, Linux farms or supers running Linux.
@Paul 77, @Chemist
Why would you do that when you could just access a server across a network? Back when I first started you just accessed some server from an X terminal. Alright, you'd probably have to use a Linux PC instead of an X terminal these days, but otherwise nothing's changed.
You are most likely correct. But by Linux I suspect you really mean Linux on x86/x64. Nothing wrong with that per se, but there can be very good technical reasons why x86 might not fit the bill. Not every academician (or developer for that matter) is best off hosting their work on x86, and it's their hard luck if they don't look around to see what else is available. As Kebabbert points out, Sparc T3 is very good for crypto. That might be handy if you're hosting a large website that has https only access. Similarly anyone doing large amounts of decimal (*not* floating point) math really needs to take a look at POWER, which is why IBM do quite well in the banking / financial services sector.
Not even IBM want to do it anymore.
See also: Blue Waters.
Sure, my workflow would be twin Xeon Linux workstation, if need more grunt x86 Linux farm (1024 nodes), finally supercomputer somewhere.
Well, we had SunOS/Solaris workstations from the Sun 3/60 through SPARCstations 1/IPX/IPX/5/10/20, Ultras 1/5/10 and Sun Blades 150 and 1500. It came time to replace the 1500s and... we couldn't. So we ended up getting a Sun Enterprise T5120 to run our stuff on. Problem is that we have some legacy stuff that will only run on SPARC, so we will have to keep it going for a while...
So, we are continuing to use the Sun Blades as X terminals and to run some things.
I wonder if there's an open-source SPARC emulator that will run on an x86 Linux box... :-)
All the pros of them outweigh Oracle ever releasing another workstation. My personal favorite is never needing to maintain hardware at a user's desk. Depending on your needs, either a few large servers, or farm it out so that each user gets their own server.
On chip crypto
I read that high-end Xeon CPUs doing crypto achieve really low performance, because it is done in software.
I don't remember the exact numbers, but something like this:
The high end Xeon cpu reached ~ 10-20% of the throughput compared to the Niagara T3 cpu. Additionally, the Intel Xeon cpu was ~ 80% loaded, whereas the Niagara T3 was ~10% loaded. Thus, on chip crypto in hardware helps a lot. The Niagara T3 chip can encrypt / decrypt almost in real time, costing very little resources.
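Taking those recollected figures at face value, the gap in work done per unit of CPU load can be worked out directly. A minimal sketch; the numbers are the rough ones quoted above (T3 throughput normalised to 1.0), not measured results, and the function name is made up for illustration:

```python
def throughput_per_load(relative_throughput: float, cpu_load: float) -> float:
    """Crypto throughput delivered per unit of CPU load (higher is better)."""
    if cpu_load <= 0:
        raise ValueError("cpu_load must be positive")
    return relative_throughput / cpu_load


# Rough figures as recalled in the post, with T3 throughput normalised to 1.0.
t3 = throughput_per_load(1.00, 0.10)         # full throughput at ~10% load
xeon_low = throughput_per_load(0.10, 0.80)   # ~10% of T3 throughput at ~80% load
xeon_high = throughput_per_load(0.20, 0.80)  # ~20% of T3 throughput at ~80% load

if __name__ == "__main__":
    print(f"T3 advantage per unit of load: "
          f"{t3 / xeon_high:.0f}x to {t3 / xeon_low:.0f}x")
```

On those assumptions the T3 would do roughly 40 to 80 times more crypto work per unit of CPU load, which is the point about hardware offload.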
Will they publish a documentation, or will this remain a black box that only Oracle can write software for? I could never find the documentation for the T3, not even the list of instructions accepted by the processor...
T3 Core is just SPARCV9, which is documented fully on sparc.org
You didn't look very hard.
> T3 Core is just SPARCV9
Yeah, just like T4 is just SPARCV9 and the latest Intel is just amd64. How about those latest and greatest new instructions that they keep talking about? Last time I checked the crypto unit certainly wasn't documented on sparc.org.
Single thread performance on these chips has been the bane of my existence. It was anemic when the chip was new, now it's more of a sad joke.
Shouldn't have been buying massively threaded CPUs, which were never designed for single-thread oomph, for single-threaded workloads then, should you?
Just another speed bump
..for x86 to run over. Seriously, why would anyone trust Larry to pick their pockets with this thing? Just another way for him to lock you in.
When you get a 64-socket x86_64 box which scales linearly in a single system image then let me know, I'll be the first to dump SPARC/POWER, but that isn't going to happen for a long time yet.
Geez what is it with the commentards today.....you all fail.
You know that in some benchmarks, the Niagara utterly crushes any x86?
It seems having to pay for a massive 64-socket system for a single system image shows somebody didn't architect their business system software properly. The trend for most customers is to subdivide the hell out of things into many VM images.
The T-series processors are for 1-4 socket systems, which is a space that x64 does just fine in. Oracle/Fujitsu do have the SPARC64, which scales to 64 processors, but it's not exactly the fastest chip in the industry.
It's already way too late for Oracle in my organization.
We migrated to RHEL Linux on Dell H/W and went from 60% CPU utilization to 4% running the same code (PeopleSoft).
Cost? About half that of Oracle hardware...
and Dell were happy to offer a 3 year H/W maintenance contract... Oracle only one year at a time.
I lurv such benchmarks you talk about.
I heard of a big company that migrated from Solaris to Linux, and they increased the performance tremendously. That is strange, because many benchmarks show that Solaris is faster than Linux on the same hardware (I have several white papers and links on this, just tell me if you want to see). So I dug deeper into this.
It turned out that they migrated from 800 old SPARC servers at 1GHz to 4,000 Linux servers using dual-core x86 Intel CPUs at 2.4GHz. The motivation for this weird comparison? Both solutions had a similar price tag. Is anyone surprised that 4,000 Linux servers at 2.4GHz give better performance than 800 old SPARC servers?
So, do you agree that Linux is faster than Solaris? Maybe not, eh? To compare Solaris to Linux, as you speak of, you need to compare Solaris to Linux on the same hardware. And if you do that, Linux loses big time. Solaris gives better throughput, I/O, TCP/IP, etc. on the same hardware, just by switching OS. Of course I have links on this. Does anyone want to see them? (I have gotten complaints that I always back up my claims by posting links to white papers, benchmarks, etc.)
So, Brian 39, did you migrate from 20 old single core SPARC servers to 400 new shiny Intel Westmere-EX octo core at 2.4GHz?
Hope your evidence is actually impartial and not the usual and rather shoddy marketeering material. I do value demonstrable claims much more than unbacked assertions, but I'd also like the backing to be readable, neither fluffy nor dry.
sales force smack down
Quite disappointed as the usual partisans and suspects are MIA on such threads. Come on, Matty B, unload on the sunshiners. Bah, no yawn icon.
Our dear Matty B is in depression, having realized that his beloved HP is being managed by a bunch of clowns who don't know their arse from their elbow. He's probably also quite upset at having lost 44% of the value of his shares since SAPman took over....Lol.
RE: Just another speed bump
--- Seriously, why would anyone trust Larry to pick their pockets with this thing?
Why would anyone trust an anonymous comment regarding whether a known person should be trusted?
--- Just another way for him to lock you in
Who can lock someone into what?
The SPARC architecture is not controlled by any one company; Larry's company is just one that participates.
Anyone can participate in the open-standards group, anyone can build products, anyone can purchase products!
@Anon: RE: Documentation
Anon> Will they publish a documentation, or will this remain a black box that only Oracle can write software for? I could never find the documentation for the T3, not even the list of instructions accepted by the processor...
Oracle's T3 data sheet indicates that the SPARC V9 architecture is implemented.
Products and Services -> Server and Storage Systems -> Sun Servers SPARC Servers -> T-Series -> Brochures and Data Sheets -> Data Sheet: SPARC T3 Processor [pdf]
"16 SPARC cores with full binary compatibility based on SPARC V9 architecture"
After entering the SPARC URL, two mouse clicks get you to the specifications:
SPARC.ORG -> SPECIFICATIONS
While we are at it, here are the extensions to SPARC V9 for Fujitsu processors.
SPARC.ORG -> SPARC Enterprise Documentation
Happy Programming to newbies who need SPARC guidance!
Here we are celebrating the T4, yet I find it amusing to see a link posted supporting the claim that for certain workloads a T-series CPU is just fantastic. We're celebrating the chip we know Oracle will push in a sales meeting, yet if you look at the link you'll see the irony.
They used a Fujitsu for the DB layer. Not a T-series. Once the T4 is out which one do you think they'll promote?
(Kebabbert, I know you're going to come valiantly to Sun's defence, I know you used the word "some", save your fingers from arthritis, I just find it funny)
Also, for those that still want to bash on about scalability at any cost you might want to remember IBM do a 4 socket, 10 core Intel chassis which allows two chassis to be coupled via QPI to give an 8 socket, 10 core setup. 80 cores, 160 with hyper threading & all running very bloody fast. HP have similar offerings, Oracle do as well though they might not want to mention nasty Intel servers these days.....
Sure someone is going to shoot me down but come on, Linux is more than NUMA aware (not like Sun had any NUMA issues of their own, think 4900/15k/25k) and yet this kit is a lot lot cheaper than Oracle SPARC servers. There are a lot of real cores in a 10 core, 8 socket box.
You even get to choose the Linux flavour and which contract level you want & with the money saved you can buy the same again & cluster the bloody machines!
Open your eyes, not every single server deployed has to go onto Oracle Premium support.....
Oracle/Sun does not have a glue chip
IBM and HP have special chips to create those 8-socket boxes. Oracle does not have the technology to create a real 8-socket box.
Oracle's 8-socket Intel box has to use the Intel chips themselves as gateways to reach chips 2 to 8. That is what happens when a chip only has 4 QPI links: one goes to I/O and the other three go to 3 of the remaining 7 chips that the chip needs to talk to.
IBM is x5, HP is XNC, Oracle is "sorry, we don't have one".
Which is also why if you buy the Exadata X2-8 you are admitting you will spend millions on bad technology.
Maybe this doesn't make sense, but a 4-cube should be doable with those 4 links, or even a 2D mesh. Or they could perhaps go back to an updated version of the crossbar switch the E10K had. Why the need for Intel or IBM glue chips?
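A hypercube needs d links per socket to join 2^d sockets. If, as the post above says, one of the four QPI links goes to I/O, the three left give a 3-cube of 8 sockets; a 4-cube (16 sockets) would need all four links for socket-to-socket traffic. A minimal sketch of that idealised topology, to show how many hops the worst case costs. This is a toy model, not the actual wiring of any Oracle, IBM or HP box:

```python
from collections import deque


def hypercube_neighbours(node: int, dimensions: int) -> list[int]:
    """Sockets directly linked to `node`: flip one address bit per link."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def max_hops(dimensions: int) -> int:
    """Worst-case number of links crossed between two sockets.

    Breadth-first search from socket 0; a hypercube is symmetric,
    so any starting socket gives the same answer.
    """
    distance = {0: 0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for nxt in hypercube_neighbours(node, dimensions):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return max(distance.values())


if __name__ == "__main__":
    for links in (3, 4):
        print(f"{links} links per socket: {2 ** links} sockets, "
              f"worst case {max_hops(links)} hops")
```

In the 3-link case each socket reaches only 3 of the other 7 directly, so the rest of the traffic is relayed through intermediate sockets, which is the "gateway" behaviour complained about above.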
And here I come! :o)
Well, I just randomly picked one world record that the T3 has. I can post others if you want to see them. Do you? Let's face it, the T3 has superior throughput because it has many light threads. But, as Oracle says: it sucks on single-threaded work. When we talk about throughput benchmarks, Niagara rules. For instance, serving many 1000s of clients with light threaded work, such as webservers. That is what Niagara is built for. Don't you agree? x86 cannot handle that kind of workload.
I mean, even the old Niagara T2+ was good at throughput. You need six IBM POWER6 servers with fourteen (14) POWER6 CPUs at ~4.7GHz in total to match one Niagara server with four T2+ CPUs at 1.6GHz when we talk about SIEBEL v8 benchmarks. Niagara can serve far more clients than IBM POWER6 can. How many IBM GHz do you need to match one Niagara T2 on this kind of workload? Ten times as much?
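The "ten times" question can be checked with simple arithmetic on the socket counts and clock speeds quoted above. A minimal sketch; it counts sockets times clock only, ignores cores and threads per chip, and takes the claim that the two configurations match on the benchmark at face value:

```python
def aggregate_socket_ghz(sockets: int, clock_ghz: float) -> float:
    """Crude 'socket-GHz' total: number of CPUs times clock speed."""
    return sockets * clock_ghz


# Figures as quoted in the post above, not independently verified.
power6_total = aggregate_socket_ghz(14, 4.7)
t2_plus_total = aggregate_socket_ghz(4, 1.6)
ratio = power6_total / t2_plus_total

if __name__ == "__main__":
    print(f"POWER6: {power6_total:.1f} socket-GHz")
    print(f"T2+   : {t2_plus_total:.1f} socket-GHz")
    print(f"Ratio : {ratio:.1f}x")
```

On those numbers the answer is a little over ten times, with the caveat that socket-GHz says nothing about how many cores or threads sit behind each socket.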
So, do you really find it amusing that "...the claim that for certain workloads a T-series CPU is just fantastic..."? Or don't you agree that the T-series CPU is fantastic for certain types of workloads? Like, much faster than Intel x86 and faster than POWER7? At half the clock speed and no cache. What does that tell you? It tells me that Niagara excels. It is fantastic. What is your conclusion? That Niagara does not excel on these types of workload? If it is fastest in the world, it is not fantastic?
"...HP have similar offerings, Oracle do as well though they might not want to mention nasty Intel servers these days....."
Maybe you just don't know of Oracle's products. But I do, I have been at Oracle events and learned more. Of course. :o) For instance here are some world records that Oracle x86 has
As long as the throughput from Niagara servers is massive and beats IBM's offerings, I do not really see the extreme need for 8-chip servers. If 4 CPUs suffice to beat the competition, then there is no real need for 8-socket servers. Or do you prefer 8-socket servers just for the sake of it? To serve 1000s of clients with decent latency, Niagara is king. Niagara is tailored for that type of workload.
I don't really understand why you claim that Exadata X2-8 is "bad technology". Do you have any benchmarks or evidence? To me, at the Oracle events, Exadata X2-8 seemed really, really impressive, with extreme performance. If you don't have any clue about Exadata X2-8, why are you stating such things?
Glue or no Glue
What @Allison Park is referring to is the lack of sophisticated electronics that properly extend from 4 sockets to 8 with Nehalem EX and Xeon E7 (formerly Westmere EX). The Sun x4800 (aka G5, X2-8) is a glueless approach to extending beyond the direct-touch 4 sockets (3 QPI links) to 8 sockets. In the IBM case the electronics are called EXA chipset (vestiges of Sequent). The value add (heavy lifting) done by IBM for this architecture should not be dismissed by anyone who lacks deep understanding. I have deep understanding and hands-on engineering experience with the Sun x4800 from my Exadata engineering past. I know the throughput and scalability challenges that server suffers. IBM's achievements in this space are astounding. Consider, for instance, it is actually less CPU stall time to locate and fetch a cacheline from the EXA chipset than local memory in the Nehalem EX offering. That doesn't actually surprise me. In spite of the fact that I've seen unknowledgeable people dismiss these "glue" offerings elsewhere on the web, I have seen and experienced the benefit. For example, one of the first "glue" systems to integrate with Intel x86 chips was the Sequent IQ-Link. The card connected to the Intel P6 (Orion) bus which the Pentium Pro MCM was attached to. The Sequent "glue" processor in that case was 510-pin Gallium Arsenide. It was not until 3 years later that even Intel had a 500+ pin processor. That processor was able to return a line to the Pentium Pro CPU faster than was possible from local memory via Intel's own memory controller.
Sorry for the "memory lane" but glue matters. On the other hand, the jury is out on whether T Series processors matter.
Ok, that was interesting information. Thanx for that. I see the advantages of glueless design. But still, if a 4-socket server is faster than an 8-socket server, does it matter if it is with glue or not?
I mean, you need six (6) POWER6 servers (P570 etc.) with fourteen (14) POWER6 CPUs at 4.7GHz to match one single Sun T5440 with four (4) Niagara CPUs at 1.6GHz when we talk about SIEBEL v8 benchmarks. I don't really understand the complaints that Niagara does not scale above 4 sockets, whereas POWER6 does scale. I mean, Niagara is much faster in some benchmarks. It doesn't matter if you have 14 POWER6 CPUs or 28 POWER6 CPUs or whatever when the performance is quite bad. As long as I get the performance that I need, I don't think it matters how many sockets a server has.
cut and paste 2008 benchmark vs a 2006 benchmark
Please join this decade if you want to make real comparisons which have credibility.
The stronger single-threaded performance should help decision support style workloads, as well as heavier duty jobs in the middleware space, such as supply chain calculations.
Also, with legacy Solaris container support on Solaris 11, those old Solaris applications should run well.
I share some of Paul77's sentiments that he expressed in his comment above. I too miss the days of the SPARC-based Sun workstation, and ever since I watched the video announcement for the original Niagara UltraSPARC T1 on the Sun website back in 2005 I dreamed of somehow obtaining a next-generation Sun workstation with one (or more) of these intriguing processors contained on the inside. Since then every time Sun, and now Oracle, announces a new generation of the UltraSPARC T-series processors I look at my aging collection of beloved UltraSPARC IIIi-based Sun Blade 1500s and sigh, thinking about what could have been if Sun kept on making cutting-edge SPARC-based workstations.
I am sure that other readers here could argue that the UltraSPARC T4 is not designed or ideal for workstation use, that workstations based on T4s would be too expensive to be competitive, that the workstation market has shifted overwhelmingly into the x86's corner, etc., and they would probably be correct on all counts, but that doesn't stop me from still dreaming about having a T4-based workstation to play with, as irrational as my longing for one may be. I suppose that my wanting an UltraSPARC T4-based workstation is somewhat similar to a petrol-head wanting to someday own a supercar: a supercar is not as sensible as a compact car or a hatchback as a daily driver, and general-purpose cars definitely have more useful utility to them than supercars do, but that doesn't stop you from lusting after a supercar anyway, no matter how ridiculous it is!
have one's cake and eat it too
The increase in single-threaded throughput is great, but I fear that it comes at the cost of the total chip throughput. I mean, the whole idea with Tx chips up until now has been to sacrifice single-threaded throughput for chip throughput.
And a factor of 5 on single threaded throughput still isn't enough IMHO to match most of the competition.
Processor Core Licensing Factor for T4?
The largest cost for the server will be the Oracle database, WebLogic or other software licenses. The T3 had a 0.25 core licensing factor; hopefully the T4's will be 0.5 to keep the per-chip price the same.
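The reasoning behind the hoped-for 0.5 is that licenses per chip are cores multiplied by the core factor. A minimal sketch, using the T3's 16 cores and 0.25 factor from this thread and assuming 8 cores for the T4 (a figure not stated in the thread); the function name is made up for illustration:

```python
def licenses_per_chip(cores: int, core_factor: float) -> float:
    """Processor licenses needed for one chip: cores times core factor."""
    return cores * core_factor


T3_CORES = 16    # from the data sheet quoted earlier in the thread
T3_FACTOR = 0.25
T4_CORES = 8     # assumption: not stated anywhere in this thread

t3_licenses = licenses_per_chip(T3_CORES, T3_FACTOR)

# Core factor the T4 would need for the same per-chip license count.
t4_break_even_factor = t3_licenses / T4_CORES

if __name__ == "__main__":
    print(f"T3: {t3_licenses:g} licenses per chip")
    print(f"T4 break-even core factor: {t4_break_even_factor:g}")
```

Any T4 factor above that break-even value would make each chip more expensive to license than a T3, whatever the hardware costs.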