Storage market

For quite a long time we have been in a trend of ever cheaper storage capacity. The price per GB kept falling until the industry crisis of 2011-2012 (flooding in Thailand). Even if the trend remains, it seems that something has changed.

1) The market is no longer seeking cheaper capacity but faster storage. SSD technologies will change the market just as multi-core changed the CPU market, when power and heat issues forced chip makers in the early 2000s to move from single to multiple cores so that CPU design could keep pace with Moore’s Law.

2) Even the demand for storage capacity has eased for personal files. People no longer need to download and store their movies and music locally. Instead they can subscribe to legal offerings such as Netflix or Spotify, or benefit from services like Google Drive or Dropbox. Not only cheaper but also simpler to use, these services have changed the market for years to come.

 

1956, IBM’s hard drive …. 10 MB

http://old-photos.blogspot.be/2011/06/hard-drive.html

 

The recent trend illustrated

[Charts: cost per GB; historical vs. actual]

http://blog.backblaze.com/2013/11/26/farming-hard-drives-2-years-and-1m-later/

 

 

Price of 1 GB of storage falling over the past 30 yrs


Gartner predicts strong Hadoop adoption for Business Intelligence and Analytics

Source: http://www.gartner.com/newsroom/id/2313915

 

Gartner Says Business Intelligence and Analytics Need to Scale Up to Support Explosive Growth in Data Sources

Analysts to Discuss Growth in Data Sources at Gartner Business Intelligence and Analytics Summits 2013, February 5-7 in Barcelona, February 25-26 in Sydney and March 18-20 in Grapevine, Texas

 

Business intelligence (BI) and analytics need to scale up to support the robust growth in data sources, according to the latest predictions from Gartner, Inc. Business intelligence leaders must embrace a broadening range of information assets to help their organizations.

“New business insights and improved decision making with greater finesse are the key benefits achievable from turning more data into actionable insights, whether that data is from an increasing array of data sources from within or outside of the organization,” said Daniel Yuen, research director at Gartner. “Different technology vendors, especially niche vendors, are rushing into the market, providing organizations with the ability to tap into this wider information base in order to make sounder strategic and prompter operational decisions.”

Gartner outlined three key predictions for BI teams to consider when planning for the future:

By 2015, 65 percent of packaged analytic applications with advanced analytics will come embedded with Hadoop.

Organizations realize the strength that Hadoop-powered analysis brings to big data programs, particularly for analyzing poorly structured data, text, behavior analysis and time-based queries. While IT organizations conduct trials over the next few years, especially with Hadoop-enabled database management system (DBMS) products and appliances, application providers will go one step further and embed purpose-built, Hadoop-based analysis functions within packaged applications. The trend is most noticeable so far with cloud-based packaged application offerings, and this will continue.

“Organizations with the people and processes to benefit from new insights will gain a competitive advantage as having the technology packaged reduces operational costs and IT skills requirements, and speeds up the time to value,” said Bill Gassman, research director at Gartner. “Technology providers will benefit by offering a more competitive product that delivers task-specific analytics directly to the intended role, and avoids a competitive situation with internally developed resources.”

By 2016, 70 percent of leading BI vendors will have incorporated natural-language and spoken-word capabilities.

BI/analytics vendors continue to be slow in providing language- and voice-enabled applications. In their rush to port their applications to mobile and tablet devices, BI vendors have tended to focus only on adapting their traditional BI point-and-click and drag-and-drop user interfaces to touch-based interfaces. Over the next few years, BI vendors are expected to start playing a quick game of catch-up with the virtual personal assistant market. Initially, BI vendors will enable basic voice commands for their standard interfaces, followed by natural language processing of spoken or text input into SQL queries. Ultimately, “personal analytic assistants” will emerge that understand user context, offer two-way dialogue, and (ideally) maintain a conversational thread.

“Many of these technologies can and will underpin these voice-enabled analytic capabilities, rather than BI vendors or enterprises themselves developing them outright,” said Douglas Laney, research vice president at Gartner.

By 2015, more than 30 percent of analytics projects will deliver insights based on structured and unstructured data.

Business analytics have largely been focused on tools, technologies and approaches for accessing, managing, storing, modeling and optimizing for analysis of structured data. This is changing as organizations strive to gain insights from new and diverse data sources. The potential business value of harnessing and acting upon insights from these new and previously untapped sources of data, coupled with the significant market hype around big data, has fueled new product development to deal with a data variety across existing information management stack vendors and has spurred the entry of a flood of new approaches for relating, correlating, managing, storing and finding insights in varied data.

“Organizations are exploring and combining insights from their vast internal repositories of content — such as text and emails and (increasingly) video and audio — in addition to externally generated content such as the exploding volume of social media, video feeds, and others, into existing and new analytic processes and use cases,” said Rita Sallam, research vice president at Gartner. “Correlating, analyzing, presenting and embedding insights from structured and unstructured information together enables organizations to better personalize the customer experience and exploit new opportunities for growth, efficiencies, differentiation, innovation and even new business models.”

More detailed analysis is available in the report “Predicts 2013: Business Intelligence and Analytics Need to Scale Up to Support Explosive Growth in Data Sources.” The report is available on Gartner’s website at http://www.gartner.com/resId=2269516.

Additional information and analysis on data sources will be discussed at the Gartner Business Intelligence & Analytics Summit 2013 taking place February 5-7 in Barcelona, February 25-26 in Sydney and March 18-20 in Grapevine, Texas. The Gartner BI & Analytics Summit is specifically designed to drive organizations toward analytics excellence by exploring the latest trends in BI and analytics and examining how the two disciplines relate to one another. Gartner analysts will discuss how the Nexus of Forces will impact BI and analytics, and share best practices for developing and managing successful mobile BI, analytics and master data management initiatives.

Jaspersoft Big Data Survey

Jaspersoft’s new big data survey includes 631 respondents from the company’s user community. The respondents come from more than fifteen countries, and the largest group (30 percent) is employed by companies with less than US$10M in revenue.

In addition, most of the participants indicated they were in technical roles. Only 6 percent of respondents specified they were business users, while 63 percent were application developers and 19 percent were either report developers or business intelligence administrators. The high number of respondents in non-management roles is important to note because there is a risk it could skew the results. The participants may have detailed knowledge of implementation details, but may lack visibility across the enterprise to all big data initiatives that are underway.

Even if participants don’t have insight into what’s swirling in executives’ heads, they are aware that the work on managing big data has started. According to the survey, 12 percent of the companies represented have already deployed a big data analytics solution. Twice as many, 24 percent, are currently implementing one, and 13 percent plan to have a project underway in the next six months. Another 13 percent are planning a project in the next 12 months. However, a significant number of organizations, 38 percent, have no immediate plans for initiating a big data project.
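As a quick sanity check on the adoption figures quoted above, here is a minimal Python sketch. The percentages are the ones from the survey; the dictionary keys are only illustrative labels, not names used by Jaspersoft.

```python
# Shares of respondents by adoption stage, in percent, as quoted from the survey.
# The key names are illustrative labels, not terms taken from the survey itself.
adoption = {
    "deployed": 12,
    "implementing": 24,
    "planned_within_6_months": 13,
    "planned_within_12_months": 13,
    "no_immediate_plans": 38,
}

# The five stages should cover every respondent.
total = sum(adoption.values())

# Everyone who has deployed, is implementing, or plans a project within a year.
active_or_planned = total - adoption["no_immediate_plans"]

print(f"total: {total}%")
print(f"deployed, implementing or planning: {active_or_planned}%")
```

The stages add up to the full sample, so roughly six in ten of the companies represented are somewhere on the path to a big data project.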

Given all of the data that suggests big data can yield enormous business benefits, why are 38 percent of the companies represented choosing not to invest in big data? Respondents cited multiple reasons, but the most prevalent response (37 percent) was that their organization only had structured relational data. The second most popular answer, 35 percent of responses, was that the organization did not understand what big data is.

Other studies have also highlighted the lack of big data skills as a barrier to organizations capitalizing on the potential of big data. Both of the top responses in Jaspersoft’s survey show this is indeed a problem, especially the top answer, since data does not have to be unstructured to be considered big data.

It is also interesting that most of the companies included in the survey are not dealing with the massive data volumes often discussed by vendors and technology evangelists. Only two percent of survey participants said their project would manage exabytes of data, and just slightly more, eight percent, dealt with petabytes. Most respondents, 78 percent, indicated their projects dealt with terabytes of data or less. Twelve percent of respondents were unsure of their project’s total estimated data volume.

E-commerce, financials and customer relationship management enterprise applications were the biggest source of content for big data projects. Respondents could select more than one response, and 363 responses specified that data originated from one of the top three systems. Hadoop may be getting the most press for managing big data, but respondents overwhelmingly indicated (60 percent of responses) that the data for their big data analytics project was stored in a traditional relational database. Only 26 percent of responses mentioned Hadoop HDFS or HBase.


The Big Picture for Big Data

Jaspersoft’s study supports what many of the other big data studies have shown. The majority of organizations are beginning to invest in big data, but big data skills remain a significant challenge. The technology landscape is still diverse, and relational databases continue to play an important role in managing growing data volumes.

Time to bust some myths of the "Big Data" era

“Big Data” might be one of the hottest topics of our era. Time to bust some of its myths.

Big Data isn’t about large companies
“Big Data” isn’t only for large enterprises with large amounts of data. It isn’t even about solving the most complicated analytical issues that only large corporations have. If you have a company of more than 2 people, some transactional system (Salesforce.com, QuickBooks or even Microsoft Excel), are running an online business and want to take advantage of your website weblogs, you fit the Data opportunity profile. Again, you’ll need solutions that can be up and running in no time and that offer data management, data integration and reporting in one package. Your last issue should be hiring Data Scientists. The majority of organizations are unable to produce a basic, real-time picture of their business in a format that’s usable by most employees. Focus on the most necessary and broad-based needs before you orchestrate for the most complex and specialized scenarios.

Big Data is Only About Massive Volume. Above and beyond the 3 Vs, it’s important to note that big data is about addressing the leading-edge business challenges that require analyzing massive volumes, real-time velocities, and/or multi-structured varieties of data. I’m talking about the notion of “whole-population analytics” against the entire population of data, rather than just the traditional capacity-constrained samples/subsets. Being able to drill into the entire aggregated population of, say, customer data, including rich real-time behavioral data, enables you to do more powerful micro-segmentation, fine-grained target marketing, nuanced customer experience optimization, and next best action.

Big Data isn’t just about the Data Scientist
Realize that the majority of the problem isn’t about the Data Scientist. The bigger issue is the “data-literate manager”. The problem isn’t so much about starting with the specialists in mind but rather starting with “everyday employees”. Most employees don’t care about the data; they care about the end result: the analytics. These analytics, on top of Big or “Small Data”, are the closest thing they know to what matters most: making better decisions faster.

Big Data Means Hadoop. As you’ve noted, big data depends on massively parallel processing (MPP), plus in-database analytics, on analytic databases, file stores, document stores, and persistence infrastructures of various sorts: HDFS, HBase, Cassandra, RDBMS-based EDWs, columnar, key-value, graph databases, etc. We’re seeing far more hybrid (Hadoop + EDW + NoSQL + whatever) big data deployments these days than ever before, and the range of hybrid models will continue to grow.

Big Data Means Unstructured Data. I agree with your focus on “multi-structured data,” but disagree with your statement that “the consistent trait of these varied data types is that the data schema isn’t known or defined when the data is captured and stored.” Considering that RDBMS-based EDWs are the established heart of big data, this statement is wrong. Better to say that “these data types vary in sources, in formats, and in when the data schema is known or defined: at capture, at storage, or when the data is used.”

Big Data is for Social Media Feeds and Sentiment Analysis. You should insert the word “only” between “is” & “for” to make this myth crystal-clear; because, clearly, one of the killer apps for big data is social media monitoring and sentiment analysis. But point well-taken: big data is the powerhouse refinery that is enabling businesses everywhere to continuously harvest intelligence not only from social media, but from log, event, sensor, geospatial, and other sources that aren’t strictly “social.”

NoSQL means No SQL. What’s happened is that “NoSQL” has greatly expanded the range of “QLs” in the big data arena: including Hive QL, Cassandra QL, SemWeb’s SPARQL, etc. The confusing variety of big data “QLs,” including good ol’ SQL, absolutely demands some sort of query virtualization/abstraction/semantic layer, especially as the back-end data platforms (MPP EDW, Hadoop, NoSQL, graph, etc.) proliferate all over the cloud.

NYTimes: Welcome to the Age of Big Data

Great article from the New York Times: The Age Of Big Data

Must-read article, best quote:

Welcome to the Age of Big Data. The new megarich of Silicon Valley, first at Google and now Facebook, are masters at harnessing the data of the Web — online searches, posts and messages — with Internet advertising. At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, “Big Data, Big Impact,” declared data a new class of economic asset, like currency or gold.
 
 
 “It’s a revolution,” says Gary King, director of Harvard’s Institute for Quantitative Social Science. “We’re really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.”

 

Couchbase Survey Shows Accelerated Adoption of NoSQL in 2012

Couchbase today announced the results of an industry survey conducted in December that shows growing adoption of NoSQL in 2012. According to the survey, the majority of the more than 1,300 respondents will fund NoSQL projects in the coming year, saying the technology is becoming more important or critical to their company’s daily operations. Respondents also indicated that the lack of flexibility/rigid schemas associated with relational technology was a primary driver toward NoSQL adoption.

You can read the results of the survey here, as well as some surprises in the survey at this page.

NoSQL 2012 Survey Highlights

Key data points from the Couchbase NoSQL survey include:

  • Nearly half of the more than 1,300 respondents indicated they have funded NoSQL projects in the first half of this year. In companies with more than 250 developers, nearly 70% will fund NoSQL projects over the course of 2012.
  • 49% cited rigid schemas as the primary driver for their migration from relational to NoSQL database technology. Lack of scalability and high latency/low performance also ranked highly among the reasons given for migrating to NoSQL  (see chart below for more details).
  • 40% overall say that NoSQL is very important or critical to their daily operations, with another 37% indicating it is becoming more important.

 

 

Surprises from the Survey

Language mix. A common theme in the results was what one could interpret as the “mainstreaming” of NoSQL database technology. The languages being used to build applications atop NoSQL database technology, while they include a variety of more progressive choices, are dominated by the mundane: Java and C#. And while we’ve had a lot of anecdotal interest in a pure C driver for Couchbase (which we now have, by the way), only 2.1% of the respondents indicated it was the “most widely used” language for application development in their environment, behind Java, C#, PHP, Ruby, Python and Perl (in order).

Schema management is the #1 pain driving NoSQL adoption. So I’ll admit that I wasn’t actually surprised by this one, because I’d already been surprised by it earlier. Two years ago if you had asked me what the biggest need we were addressing was, I would have said it was the need for a “scale-out” solution at the data layer versus the “scale-up” nature of the relational model. That users wanted a database that scaled like their application tier – just throw more cheap servers behind a load balancer as capacity needs increase. While that is still clearly important, the survey results confirmed what I’d been hearing (to my initial surprise) from users: the flexibility to store whatever you want in the database and to change your mind, without the requirement to declare or manage a schema, is more important.
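The schema point above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the standard-library sqlite3 module stands in for a relational database, and a plain dictionary of JSON strings stands in for a document store. It does not use the Couchbase API, and the table, keys and field names are made up for the example.

```python
import json
import sqlite3

# Relational side: the columns are declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

# Storing a new attribute fails until the schema itself is changed
# (for example with ALTER TABLE ... ADD COLUMN).
try:
    conn.execute("INSERT INTO users (id, name, twitter) VALUES (2, 'bob', '@bob')")
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)

# Document side: a dict of JSON strings stands in for a document store.
# Each document carries its own structure, so a new field needs no declaration.
documents = {}
documents["user:1"] = json.dumps({"name": "alice"})
documents["user:2"] = json.dumps({"name": "bob", "twitter": "@bob"})

bob = json.loads(documents["user:2"])

print("relational insert error:", schema_error)
print("document read back:", bob)
```

The trade-off is that the application, rather than the database, becomes responsible for coping with documents that have different shapes.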

Hadoop Twelve Predictions for 2012

The past year was punctuated by significant advancements in Apache Hadoop and increasingly wider adoption of Hadoop technology across the enterprise. Companies are continuing to use Hadoop in exciting new ways to better serve their customers, inform product development and drive operational efficiency like never before. Join Mike Olson, founder and CEO of Cloudera, as he shares his twelve major predictions for Hadoop in 2012. He will also unveil predictions from key industry analysts.

Olson will discuss predictions for:

– Where new opportunities for Hadoop will be found within the enterprise
– How new projects being developed for and on Apache Hadoop will expand data analysis capabilities
– Ways that Apache Hadoop will help companies solve short term and long term business challenges

The Twelve Predictions for 2012 by Mike Olson

 

Facts and stats, MongoDB trend

Although NoSQL solutions like Hadoop (Apache Foundation), Redis (VMware), Cassandra (developed and used by Facebook) or CouchDB have been getting a lot of media attention lately, MongoDB appears to be the product to catch in this emerging market.

I ran some searches on Google Trends and on job trends (using indeed.com) for various NoSQL solutions, and the evidence points to MongoDB leading those trends.

SourceForge, Disney and Craigslist are all using MongoDB; the full list of adopters is here: http://www.mongodb.org/display/DOCS/Production+Deployments

 

 Google Trends result for “MongoDB”

 

 Job trend from indeed.com for various NoSQL solutions

 Job trend from indeed.com result for “MongoDB”


 

NoSQL making big inroads in enterprise development

A new Evans Data survey shows NoSQL making big inroads in enterprise development, driven by Big Data related projects.

NoSQL is being rapidly accepted by corporate enterprise developers in North America with 56% reporting at least some use of the schemaless database and 63% citing plans to use in the next two years according to Evans Data’s recently released North American Development Survey. NoSQL is considerably stronger in the enterprise segment than within the general developer population where 43% expect to use NoSQL.

The survey of over 400 developers conducted in May, 2011 is part of Evans Data’s Global Development Survey series. This showed use of NoSQL is rising in EMEA, where 39% of developers report plans to use, and APAC where more than a quarter of the general developer population report using NoSQL today and 68% have future plans.

“The advent of Big Data is driving adoption of NoSQL, and this is especially true in the corporate enterprise,” said Janel Garvin, CEO of Evans Data Corp. “While it may have got its start on the web with innovations like Big Table and MapReduce, it’s the enterprise that can most benefit from NoSQL and developers realize this across all geographical regions.”

Other highlights from this comprehensive survey of North American developers include:

  • Although Mac OS is now more popular than Linux as a development desktop environment, Windows continues to dominate with over 80% using some version of Windows as their primary platform.
  • Almost 40% of North American developers are now working on apps for a wireless device.
  • Eighty percent of North American developers expect to be writing multi-threaded apps in the next two years.

The Evans Data Global Development survey is an in-depth survey of over 1200 software developers worldwide. It has been conducted twice a year since 2000 and follows development trends in three major regions. Content is broad-based and includes platform and language adoption, mobile development, cloud development, databases, development tools and methodologies, and other current issues or interests.