June 27, 2012 in Curation, Social Media, Technology Tools | Tags: BigData, collaboration, Comprehension, Curation, Data Capture, Distraction, Filtering, Google, Internet, Meaning, Microsoft, Mobile, Pearltrees, Search Engine Optimization, SeeSaw, Social media, Social Network, Twitter, Understanding, Workspace
In a previous post I presented the challenges involved in deriving value from Big Data and in particular unstructured Big Data, which increasingly dominates the Social sphere. The tools that will enable us to make sense of the plethora of conversations, contributions and observations are still in their infancy, and it appears inevitable that we will have to rely on human scanning and analysis for the short to mid term.
In his book, “The Shallows“, about the effect of the internet on our brains, author Nicholas Carr discusses the demands on our working memory or, as he puts it, “our mind’s scratch pad”. Working memory is our CPU and the agent that identifies and transfers information to our long term memory, which is our data store. It is therefore a pivotal component in our ability to seek out and retain knowledge.
“The information flowing into our working memory at any given moment is called our “cognitive load“. When the load exceeds our mind’s ability to store and process the information… we are unable to retain the information or to draw connections with the information already stored in our long-term memory”
The internet, and particularly the social web, is constantly bombarding our working memory with stimuli that are intentionally distracting (this post is a representative example, with multiple hypertext links in the first two paragraphs). The pervasiveness of mobile connectivity means that we are always on-line, and never at rest from the interrupting nature of the medium. This suggests we need assistance in processing the social media stream, firstly in being able to recognize important and relevant information, and secondly to earmark that information for further analysis, refinement or augmentation. The first requirement is for a social media radar, the second for a social information refinery.
Search remains our favorite tool for seeking out information. Google dominates with 4.7 billion searches a day, but Twitter is not far behind (considering its size) at 1.6 billion, and both services are growing fast (Google at 30% per year, Twitter at 50% per year). However, both search functions have limitations, as evidenced by the changes announced by both companies in the past 12 months. Google has recognized the effects of Search Engine Optimization and the fact that we demand results that are more contemporary. Both companies have added semantic search elements to their armory, an acknowledgement that searches need to be relevant to time, location, context and searcher’s intent. However, such elements are far from comprehensive. Furthermore, because the definitions, rules and algorithms are unpublished, the searcher is dependent on Google’s and Twitter’s interpretation of what was intended. We are still a code generation or two away from being able to parametrize semantic search using our personal or group definitions and meanings.
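To put the two growth rates in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above and assumes, purely for illustration, that both rates stay constant and compound annually; real growth is unlikely to be that tidy, and the function names are my own.

```python
import math

# Figures quoted in the post: searches per day and annual growth rate.
GOOGLE_DAILY, GOOGLE_GROWTH = 4.7e9, 0.30
TWITTER_DAILY, TWITTER_GROWTH = 1.6e9, 0.50


def projected(daily: float, growth: float, years: float) -> float:
    """Daily volume after `years` of constant compound growth (an assumption)."""
    return daily * (1 + growth) ** years


def years_to_parity(a: float, ga: float, b: float, gb: float) -> float:
    """Years until the smaller but faster-growing series b catches series a."""
    return math.log(a / b) / math.log((1 + gb) / (1 + ga))


if __name__ == "__main__":
    t = years_to_parity(GOOGLE_DAILY, GOOGLE_GROWTH, TWITTER_DAILY, TWITTER_GROWTH)
    print(f"Parity after roughly {t:.1f} years")
```

Under those assumptions the two volumes would meet somewhere between seven and eight years out, which is a reminder that "not far behind" is relative to size and growth rather than to today's raw numbers.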
Turning now to the need for refinement: this is the ability to analyze what we have found, understand its value and relationship to other captured information, and provide single or collaborative commentary on the discovery itself. Once again the technology has provided some rudimentary tools, commonly called curation tools. Named by analogy with museum curation, these tools capture and display information of interest. There are over 40 such applications and each provides a web page in which curators can display their captured content. In many cases the tools allow for comments to be added as separate components, listed in reverse chronological order. Some curation tools, such as Pearltrees, support content linking, allowing curators to provide insight into relationships between islands of information. Many tools provide a browser add-on that will enable the curator to save browsed/searched content to the curated web site. The tools are improving but there is still a small disconnect between the radar and the refinery functions.
Until now. SeeSaw is still a fledgling product, yet it offers to bridge that gap between scanning and curation, and holds great promise in being able to map content relevancy and provide a lens on both active streams and refined content. What is particularly appealing about SeeSaw is that it is built for visual scanning, as opposed to lexical scanning. “See”, the radar component of the tool, filters live social media streams and displays the visual content of links and embedded graphics. For Twitter this is a vast improvement on current viewing dashboards such as Hootsuite and Tweetchat, where speed reading is essential in keeping up with fast moving activities and events such as chatrooms. SeeSaw not only displays the images but can also stream the video links within individual messages.
This visual facility has three immediate advantages. Firstly, the participant can remain in the chat stream without having to hyperlink to a new page, which has been a major distraction and an extended opportunity for further interruptions. Staying in the same window enables continuous contact with the flow of active conversations. Secondly, it enables the viewer to see trends within the stream, connections (i.e. who is talking to whom) and tangential conversations. Thirdly, and most importantly, it enables the participant to save active content in the stream to the Saw side of the product, which is the bridge to the product’s curation functionality. A simple toggle button allows the reader to change between the active stream (See Board) and the curated site (Saw Board). In practice this allows the participant to remain in contact and context with the conversation, as opposed to the multi-windowed, interrupt-laden environment to which we have been constrained. Reflection and analysis can now occur after the scan or chat, ensuring that focus and attention can be appropriately applied to both.
SeeSaw is more than welcome in my active toolkit: it allows me to “Embrace, Extend and Expand” (via EMC, via Microsoft) my social media environment. It is more than a helpmate – it has the potential to be my primary Social Media Assistant.
June 12, 2012 in Big Data | Tags: Analysis, BigData, collaboration, Curation, Data Definition, Data model, ILM, Knowledge, Knowledge Management, Lifecycle Management, Meaning, Networking, Open Source, Peter Drucker, Semantics, Social media, Social Media Value, Social Network, Tony Walker, Unstructured data
The more we know the less we understand. Nowhere is this more true than on the Social Network, where volume, velocity, volatility and variability are increasing on a daily basis. Those 4 V’s are part of a definition of big data, which includes both structured and unstructured data. We may have a reasonable chance of obtaining valuable information from the structured data population. That depends, of course, on the extremity of any single one or combination of the 4 V’s, yet author, time stamp, location or any other tag that accompanies a communication is easily identifiable. However, unstructured data poses a challenge several orders of magnitude greater. Structured data benefits from data models, data definitions and rules that enable us to extract reports and analyses, even to the point of discovering new relationships and information from the regimented data. To do so, we need to nurture and maintain these structures, to prevent a degradation of data quality and avoid conflicts and loss, a goal that often eludes the best efforts even in mature IT shops. However, this is not the case for unstructured data.
In general there are no data models, no data definitions, no rules and no discipline of housekeeping for unstructured data in Social Media. At least nothing that is commonly held. Individually, of course, we have an idea of what we are communicating, and we probably use both our own data definitions as well as those we assume are being used by others in any conversation; but these are amorphous concepts and certainly nothing that can be referenced by others or by cyber analysis. The same is true to a lesser degree in IT organizations and the worlds behind the firewalls. At least in those environments best practices such as change management and planned organization of unstructured data (viz. SharePoint) should ensure some semblance of control and order, if not insights into hidden information.
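The contrast between structured and unstructured data can be made concrete with a small Python sketch. The `Message` type and the two sample messages are hypothetical, invented purely for illustration; they are not drawn from any real service or dataset.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    # Structured fields: agreed types and meanings, trivially queryable.
    author: str
    sent_at: datetime
    location: str
    # Unstructured field: free text with no shared model or definitions.
    text: str


# Hypothetical sample messages, made up for this illustration.
messages = [
    Message("ann", datetime(2012, 6, 1, 9, 0), "Boston",
            "Loving the new release, it is sick!"),
    Message("bob", datetime(2012, 6, 2, 14, 30), "Austin",
            "Off sick today, skipping the release call."),
]

# Structured query: unambiguous, because 'location' has an agreed definition.
in_boston = [m for m in messages if m.location == "Boston"]

# Unstructured "query": a naive keyword match cannot tell praise from illness.
mentions_sick = [m for m in messages if "sick" in m.text.lower()]

print(len(in_boston), len(mentions_sick))
```

The location filter returns exactly what was asked for, while the keyword match lumps together two messages that mean entirely different things, which is the gap that shared definitions would have to close.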
We do, however, have some rudimentary tools at our disposal, but like early man our technical bows and arrows are a poor match against the stampeding herd of beasts that is the social network stream. So like our ancient ancestors we have to develop strategies and skills that help us survive and thrive in this world of pervasive communications. Tony Wagner, author of “The Global Achievement Gap”, identified three such skills that he believes are fundamental for us to foster and teach. He calls them the “three C’s – critical thinking, effective oral and written communication, and collaboration.” He also believes that this should be the prime focus of our educators, and that we should establish “a new National Education Academy, modeled after our military academies, to raise the status of the profession and to support the R and D that is essential for reinventing teaching, learning and assessment.”
Knowing how to perform the three C’s is therefore one of the keys to success. Being able to put this knowledge into practice, and bring organization and governance to bear on the resources and data, requires additional skills if enterprises plan to approach and consume the labor and thoughts of distributed social resources.
Taking these observations a little further, I believe the following 5 components are necessary in order to navigate, participate and collaborate in the world of social information.
1. Understanding – we need a better understanding of what we are dealing with in the social media so that we can properly distinguish and farm target crops, whether they are preferences, demographics, opinions, gossip, information, knowledge, wisdom, or something altogether different. However, to improve that comprehension we need to be more aware of the dynamics of how we think, analyze, and communicate effectively. What, for example, is a thought, and what are the attributes of thought that make it consumable? We have a notion of answers to those questions but they are personal and subjective. Yet we cannot rely solely on subjective interpretation, so we need a shared and objective framework or model of knowledge. Knowledge is the lodestone of the social community, and the more we understand it, its nature, behaviors and properties, the more we can improve the discovery, sharing and use of valued information in the social stream.
Peter Drucker (1909-2005), one of the most respected commentators on management theory and practice, believed that “knowledge worker productivity” would be the next frontier of management. Drucker was also famous for his quote “If you can’t measure it, you can’t manage it”, to which I would add the following prefix, “If you can’t understand it, you can’t measure it.” Building a common understanding and framework(s) for knowledge management is essential in determining meaning, relevancy, relationship or other characteristics of information within contextual and cultural settings. We need to be able to detect when ambiguities and obfuscations are intended and make a documented judgement on meaning when they are not.
2. Networking – it might be stating the obvious to point out that people, individually and collectively, lie at the heart of the global social community. And it stands to reason that knowing who is who, and what they know is another fundamental layer needed for success. The size and complexity of big social data demands a superior set of skills that can identify, analyze, classify and then connect individuals to each other and their knowledge sets. I described this in my previous post Network Weavers which attempted to define the needed attributes (acquisition: filtration/review: association: curation: construction). As the dimensions of the network, the participants and their contributions grow so will the level of skills, and proficient network weavers will become more of a premium resource than they are today. It is likely that networkers will depend on directories, personal or even corporate at first, but increasingly the directories will become more public and entries will contain more social information such as skills, contributions, preferences and factors that others will be able to use to determine relevancy and fit for purpose.
3. Analytics – With improved understanding of knowledge and how we use and abuse it, we can approach analysis with a higher level of confidence in the accuracy of our observations. There are techniques and technologies that attempt to extract meaning from unstructured data but they still fall short of the human computer that is the brain when it comes to analyzing written and visual communications. As with humans, machine semantics are bounded by self-imposed rules and definitions, and as with humans, communication is improved if there is an agreed set between participating bodies. If those rules and definitions remain hidden and obscured then the output can only be regarded as personal opinion. Rating the relevancy or social worthiness of an individual or entity against undisclosed rules and definitions has as much value as the street corner tipster who whispers a sure-fire winner for any given horse race. Consequently social media demands semantic definitions that are shared amongst correspondents and a semantic analysis engine with the flexibility to parametrize selected characteristics so that relevancy can be tuned to group or community objectives.
4. Curation – In an earlier post, Curation – In Need of a Cure, I raised the need for knowledge workers to approach the care and maintenance of Social Media information in the same way that enterprises manage their data through Information Lifecycle Management. It is not enough just to store knowledge as we do currently with Pinterest, Tumblr, scoop.it and others: beyond catching the item in our personal butterfly net, our efforts resemble little more than childhood scrapbooks of things that caught our interest and appetites. Curation is an excellent term for the housekeeping that needs to be performed on the captured knowledge data. In museums and art galleries curation is a highly sophisticated skill set that seeks to first isolate the item of knowledge, then to expand it with information about its provenance (where it came from) and pedigree (e.g. what school of thought), augment it with related content (supporting and detracting) and finally exhibit it to educate and edify an interested audience. Curation is an essential component in building a rich and relevant knowledge base, and can and often does lead to new insights and innovations.
5. Collaboration – Unlike “Field of Dreams” you can’t just build a field and expect the games to begin. All the understanding, networking, analyzing and curating will bring but small value if you keep it all to yourself. The key to success lies in participation. The more you contribute, the greater value you generate both for yourself and for your correspondents. The root of the word collaboration is “labor”, meaning work or effort, and the prefix “Co” means sharing. The more you share and contribute the more you will be rewarded by your involvement with the social network. You will be further rewarded as others do the same, whether it’s contributing common rules and definitions, understanding of knowledge and thought, the names and skills of great social network participants, or exemplary curation of well defined and related content. It is the act of collaboration that provides the secret sauce of success and bridges the resources and knowledge in the social stream. This is not theory: this is proven without any shadow of doubt by the open source community. If you get the opportunity, interact with an open source contributor, and ask them for guidance; they have been doing it effectively, efficiently and profitably for more than a decade.
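Two of the components above, Analytics and Curation, lend themselves to a small illustration. The Python sketch below is only one possible reading, not a description of any existing tool: the community names, terms, weights and the `CuratedItem` fields are all hypothetical. It shows relevancy scored against definitions that are published and tunable per community, and a curated record that carries provenance, pedigree and related content alongside the item itself.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical shared definitions: each community publishes the terms it
# cares about and how heavily each one counts toward relevancy.
SHARED_DEFINITIONS = {
    "data-team": {"curation": 3.0, "semantic": 2.0, "analytics": 1.0},
    "teachers": {"collaboration": 3.0, "critical thinking": 2.0},
}


def relevancy(text: str, community: str) -> float:
    """Sum the published weights of every defined term found in the text."""
    weights = SHARED_DEFINITIONS[community]
    lowered = text.lower()
    return sum(w for term, w in weights.items() if term in lowered)


@dataclass
class CuratedItem:
    """One isolated item plus the context a curator adds to it."""
    content: str
    provenance: str  # where it came from
    pedigree: str    # e.g. school of thought
    supporting: List[str] = field(default_factory=list)
    detracting: List[str] = field(default_factory=list)

    def exhibit(self) -> str:
        """Render the item with its context for an audience."""
        return (
            f"{self.content}\n"
            f"  source: {self.provenance} ({self.pedigree})\n"
            f"  supporting: {len(self.supporting)}, "
            f"detracting: {len(self.detracting)}"
        )


if __name__ == "__main__":
    post = "Semantic analytics need shared definitions before curation can work."
    print(relevancy(post, "data-team"), relevancy(post, "teachers"))
    print(CuratedItem(post, "blog post", "knowledge management").exhibit())
```

Because the weights live in plain, shared data rather than inside an unpublished algorithm, two communities can score the same text differently and each can see exactly why.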
WARNING: Please don’t attempt any of the steps above without clear and careful planning