Big Data is like TEENAGE SEX

The joke doing the rounds on social media compares big data to teenage sex: everyone's talking about it, only a few know how to do it, they all think everyone else is at it and so pretend they are too. And humans fumbling about in the dark, so to speak, are the weakest link in the world of analytics and data, said Steve …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward

    My company thinks...

    ... that the Big Data Team are the people you need to call to migrate a few databases from SQL2000 to SQL2008 (yes, really, SQL2012 is far too new-fangled). Unfortunately, the Big Data Team seem to think the same.

    I suppose that's like getting to first base over the summer holiday with a girl from a neighbouring school and then, when term [=semester] starts, telling your mates you went all the way with a visiting movie star; the analogy seems to hold ...

  2. Anonymous Coward

    A word with that sub, please

    "Big Data is like TEENAGE SEX

    Everyone is talking about it, nobody doing it correctly...."

    Boy, are your social statistics out of date! Not sure how many people are talking about teenage (and pre-teen) sex, but for some decades now almost EVERYONE has been doing it.

    1. Matt 21

      Re: A word with that sub, please

      Of course you would say that :-)

      Coming back to "confirmation bias", it sounds to me that they're trying to say "it doesn't matter how stupid the data you get out the other end seems, you have to believe it without question".

      As for the "you don't know what to look for", I think he's trying to say "just because you didn't find anything doesn't mean Big Data is no good, it just means you're not looking in the right place", or "only really smart people can see the emperor's new clothes".

    2. Anonymous Coward

      Re: A word with that sub, please

      "but for some decades now almost EVERYONE has been doing it."

      Aye, but I noticed you left 'correctly' out of that statement :-)

      1. P. Lee

        Re: A word with that sub, please

        >"but for some decades now almost EVERYONE has been doing it."

        >Aye, but I noticed you left 'correctly' out of that statement :-)

        Also to note: teenage sex is stupid. The people pushing it are immature, irresponsible, selfish and likely to leave you with nothing but tears, an STD and a very expensive bump which you're ill-equipped to handle and which will drain your resources, after they've got what they want from you.

  3. Mike Pellatt

    He cannot be serious. Oh, he is....

    Dave Coplin, chief envisioning officer at Microsoft [.....] agreed the "main challenge is the human element" as big data forces a change to scientific approach - moving from trying to work out why something happens to a "world of correlation based on sample sizes".

    Errr, WTF ?? Yes, statistical correlation based on sample size is useful to validate theories (cf Higgs Boson, etc) - but that's based on a prediction of what should be observed if a theory about "why" is correct.

    You don't, and can't, work the other way round. You may get a clue from correlation as to the "why", but that's just a pointer for further research and then working out a further set of predictions that you then validate experimentally.

    For the result of applying correlation directly to causation, just look at all the crappy public health policies we've had over the last couple of decades, some of which are at last unravelling.

    No wonder HP found Autonomy wasn't worth as much as they thought.

  4. SJG

    ... and just as in teenage sex, the new breed of young data scientists will also eventually realise that they have spent a long time fumbling in the dark re-discovering the things that have been common practice for some time.

    However, I do wish they'd catch up a little quicker. Maybe one day they'll realise that:

    * there's no such thing as 'unstructured data' (if it's unstructured then it might as well be white noise)

    * it's daft to call a whole family of databases "NoSQL" then spend the next 3 years building SQL on top

    * taking data out of a relational database and putting it somewhere else doesn't make the data "non-relational"

    * just because the security cameras in your HQ record more data every day than your customer ordering system, it doesn't mean they hold more value

    * when it's harder to merge data in your database than it is using vlookup in excel then you probably don't have the right database for analytics
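For comparison with the last point, here is a minimal sketch of a vlookup-style merge expressed as a relational join, using Python's built-in sqlite3 module. The table names, column names and the `lookup_prices` helper are hypothetical and chosen only for illustration; nothing here comes from the original post.

```python
import sqlite3


def lookup_prices(orders, prices):
    """Attach a price to each order row, much as VLOOKUP would in a spreadsheet.

    orders: list of (order_id, sku) tuples   (hypothetical schema)
    prices: list of (sku, price) tuples      (hypothetical schema)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, sku TEXT)")
    conn.execute("CREATE TABLE prices (sku TEXT PRIMARY KEY, price REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    conn.executemany("INSERT INTO prices VALUES (?, ?)", prices)

    # LEFT JOIN keeps every order; a missing price comes back as None,
    # roughly where a spreadsheet lookup would show #N/A.
    rows = conn.execute(
        "SELECT o.order_id, o.sku, p.price "
        "FROM orders AS o LEFT JOIN prices AS p ON p.sku = o.sku "
        "ORDER BY o.order_id"
    ).fetchall()
    conn.close()
    return rows
```

If the equivalent operation in a given data store takes noticeably more effort than this one query, that is roughly the situation the bullet is complaining about.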

    I guess I shouldn't really complain; after the SOA guys spent 10 years spending all the CIO's budget on integration projects that never seem to deliver anything more than a pile of questionable documentation, perhaps it's the turn of us data guys. Hopefully there'll be some budget left over for some real data analysis after the "Bigdata" teenagers have finished spending on all those blue (or should it be white) elephants.

    1. Michael Wojcik Silver badge

      * there's no such thing as 'unstructured data' (if it's unstructured then it might as well be white noise)

      Facile. There's a world of difference between ambiguously-structured data, such as natural-language text or images, and unambiguously-structured data like the values in an RDBMS with well-defined columns. Quibbling about the ideal meaning of "unstructured" is unproductive.

      * it's daft to call a whole family of databases "NoSQL" then spend the next 3 years building SQL on top

      Inasmuch as "NoSQL" was backformed into "Not only SQL" shortly after its introduction, this objection is also baseless.

      * taking data out of a relational database and putting it somewhere else doesn't make the data "non-relational"

      Data itself is never relational or "non-relational", so this is vacant. Data can be in a relational structure, or not; if it's not available in a relational structure[1], then it's not available in a relational structure.

      * just because the security cameras in your HQ record more data every day than your customer ordering system, it doesn't mean they hold more value

      Straw man. No one cited in the article (even Microsoft's "envisioner") claimed otherwise.

      * when it's harder to merge data in your database than it is using vlookup in excel then you probably don't have the right database for analytics

      Perhaps, though that's such a vague claim it could mean almost anything.

      I'm all for skepticism and generally even for contrarianism - the IT industry, broadly speaking, is far too fond of its fads. But this sort of "I work in a narrow part of this field and it's all a homogeneous lump where everyone's problems are the same and no solutions other than mine are valid" crap is both tiresome and useless.

      [1] And necessary normal form, modulo denormalization with controls to ensure a valid transformation to normal form always remains possible.

  5. Paul Hovnanian Silver badge

    '"main challenge is the human element" as big data forces a change to scientific approach - moving from trying to work out why something happens to a "world of correlation based on sample sizes".'

    But then we always hear that correlation is not causation. Sure, you might be able to wring a few interesting patterns out of the data. But if you can't get back to some underlying mechanisms behind the measurements, they mean nothing. You have what is called 'cargo cult science'. We did everything correctly based on the data we had. But, lacking a model of the phenomenon, we missed the pieces for which no data had been, or could be, collected. We kept up the airstrips, but the planes stopped coming.

    We are hard wired to find patterns in random noise. It's a survival skill from when spotting the tiger in the tall grass meant survival. But that's a skill humans share with animals. Where we differ is in deduction. Getting to the underlying 'why' behind the data.
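The point about patterns in random noise can be illustrated with a small sketch. It generates many unrelated random series and reports the pair with the strongest Pearson correlation; with enough series and few points per series, a "striking" correlation usually turns up despite there being no mechanism behind it. The function names, series counts and seed are assumptions made for the illustration, not anything from the thread.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_pair(n_series, n_points, seed=0):
    """Search independent Gaussian noise series for the most correlated pair.

    Returns ((i, j), r) where r is the correlation of series i and j.
    Every series is pure noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    series = [
        [rng.gauss(0, 1) for _ in range(n_points)]
        for _ in range(n_series)
    ]
    best = max(
        combinations(range(n_series), 2),
        key=lambda pair: abs(pearson(series[pair[0]], series[pair[1]])),
    )
    return best, pearson(series[best[0]], series[best[1]])


if __name__ == "__main__":
    pair, r = strongest_pair(n_series=200, n_points=20)
    print(f"series {pair[0]} and {pair[1]} correlate with r = {r:.2f}")
```

Searching 200 series means comparing 19,900 pairs, so the best match is selected from a very large pool. Without a model predicting which pair should be related, the reported correlation says nothing about a "why".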

  6. veti Silver badge

    How far does the analogy go?

    What's the equivalent of "unwanted pregnancy" here? Or "STD"? Sexting?

    And do we elder-generation types risk getting arrested if we take too close an interest?

    Actually, that seems disturbingly plausible.

  7. Amorous Cowherder

    As with most things, new paradigms are great and have a genuinely useful purpose and some get some great use out of them. The trouble with IT is that everyone is after the next big thing, we're all bored with "the cloud", it's passé now and "So last year darling!", so like the media we have to drum up some new thing to keep companies selling stuff and CTOs' arses on their expensive office chairs.

    As another poster has said, unless the data has some value, simply collecting more and more of it will not somehow magically give it value. One thing traditional databases have is a design: they are conceived to store a certain type of data in an organised structure, and to a certain extent you will get mostly useful data from them, as you are constrained by design in what you can put in. BD on the other hand is simply an excuse to just grab as much data as you possibly can then decide what the hell to do with it after you have it! "Quick, grab all the JPGs the company has on its 53TB NAS filers and see what we might have!". OK, but we're a company selling copper pipe and plumbing fittings, will this give us anything useful after spending £1.5m on kit to do this?!

    Don't get me wrong, I can see some uses for BD type paradigms, I'm sure some have found it to be very useful, but once again we see a new technology and everything is shoe-horned to use it without any thought about its suitability to the tasks, simply because it is the latest buzzword.

  8. Harry Kiri

    The problem with Big Data / The Cloud (tm) is that while we all joke about it being a load of old cobblers, the Govt sets up Big Data and Cloud strategies to pour taxpayers' money into the coffers of suppliers.

    So it's working perfectly!
