January 5, 2013 in Big Data, Frameworks, Governance, Information Lifecycle Management, Social Media, Technology Tools | Tags: collaboration, Directories, Knowledge, Lists, meritocracy, Open Source, people database, rating system, Social media, Web search engine
First published in Internet Media Labs Blog – 27th October 2012
We are amassing data at an unprecedented rate. In the course of a day the internet handles more than 1,000 Petabytes of data (2011 figures), a volume projected to double in less than three years. That’s a million terabytes or a billion gigabytes on the public internet alone. Granted, there is a lot of duplication, and the amount of image and video content is greatly contributing to the accelerated growth. Furthermore, our growing dependency on mobility demands even greater participation and production, which further magnifies digital traffic.
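The unit conversion behind those figures can be checked with a few lines of Python. This is only a sketch: it assumes decimal (SI) units, where each step up is a factor of 1,000, and the variable names are illustrative rather than taken from any source.

```python
# Decimal (SI) storage units expressed in bytes (assumed convention).
GIGABYTE = 10**9
TERABYTE = 10**12
PETABYTE = 10**15

# The 2011 figure quoted above: more than 1,000 petabytes per day.
daily_bytes = 1_000 * PETABYTE

daily_terabytes = daily_bytes // TERABYTE
daily_gigabytes = daily_bytes // GIGABYTE

print(f"{daily_terabytes:,} TB per day")
print(f"{daily_gigabytes:,} GB per day")
```

Under binary units (factors of 1,024) the totals would come out somewhat differently, so the round numbers hold only for the decimal convention.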
That is a lot of data and a very large amount of noise carrying a decreasing ratio of signal. How do we operate in such an environment and meet our objectives for education, career, parenting, healthcare, community participation, consumerism and entertainment? How do we locate and recognize the availability and qualities of resources that will help us live our lives productively and successfully?
A complex question no doubt, but one that highlights the current capabilities and shortcomings of the network today.
The short and most common answer would be search engines. To a degree that is a reasonable response, but given the immensity of available data it is woefully short of satisfying anything but the last two on my list of objectives (consumerism and entertainment).
The issue starts with search engines and the demands of commercialism. Commerce sustains our civilization and provides the impetus for innovation and discovery. But it also dominates the way we create and prepare content, and the way we search for information. We are also largely dependent on a single search engine, which is still evolving though firmly rooted in textual analysis. Yes, there are other search options, but the majority of us use Google.
Search technology is beginning to branch out, as witnessed by Google’s goal of producing a knowledge graph. Currently it has the ability to determine sentiment, which is the first step in semantic analysis. Yet there is a long way to go before search can provide an accurate return on how, what and who we are searching for.
Google spends a lot of capital on developing and improving search algorithms, which are obscured to prevent gaming the system. Those algorithms perform a large number of calculations that include the analysis and synthesis of web content, structure and performance.
Providers of content and information are aware that they can improve the ranking of their published material by optimizing their web site through Search Engine Optimization (SEO), Conversion Rate Optimization (CRO) or improving the quality and attractiveness of their content. In addition the search engine vendor(s) provide consulting services to assist content providers in achieving approved “white hat” SEO status, as opposed to “black hat” SEO, which is risky, unapproved, and can get a site banned.
Any search results in an index of entries ranked by how well they have been produced and optimized. The more content humankind produces, the more commercial entities will spend to ensure a high ranking so that we consume their products or services; after all, few consumers go beyond the first page of search results. Hence my assertion above that consumerism and entertainment (which for the sake of argument includes news and events) are the principal beneficiaries of the current solutions. And that’s great if you are catching up on news, wish to be entertained or are shopping, either actively or casually. The ranking system will give you the most up to date, the most popular and the most advertised consumables.
However, the ranking system doesn’t scale down for the individual, the community or small businesses or enterprises, unless predetermined keywords are used in the content and search. A small voice cannot be heard where shouting is encouraged, even demanded. The more we use search engines the louder that shouting becomes. Furthermore, the ranking system doesn’t really scale economically for SEO content, as globalization will introduce more competition for the coveted top-ranked entries, demanding increased effort and optimization.
But this post is not about search engines and optimization of content. It’s about locating resources and identifying the quality and relevancy that will help in collaboration: finding people, ideas, material, skills and availability so the other objectives on my list can be fulfilled.
We need something more than simple signposts or lists, valuable as they are. We need a capability that will not only locate a resource, but one that will also provide us with much needed information about the resource, its properties, location, status, history and relationships to other resources. In short we need directories, repositories of resources and their attributes that are easily accessible and extensible.
Directory databases have been around for a long time and are currently in operation in most large enterprises, most commonly behind corporate firewalls. They meet many of the requirements outlined above, although their use has been necessarily constrained to a management and security function. In most implementations they perform that function well. That style of directory is also appropriate beyond the firewall, especially when authentication amongst diverse communities and populations needs to be supported.
Yet we can do so much more with directories, especially if we liberate their extensibility and open them up to collaborative contributions and housekeeping. Today we keep our own lists and collaborate on those in communities of interest. There are several listing applications on Social Media such as list.ly, Twitchimp or the late lamented Formulists. These are great applications and no social media maven can exist without one. But they are only lists and they only carry a small number of entries and attributes.
Open collaborative directories will be able to scale to support large numbers of entries and attributes, including attributes that are determined by the participants and their communities. In other words directories will carry the hard facts about a resource as well as attributes that are determined by those who use and collaborate with those resources.
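One way to picture such an entry is as a record holding two kinds of attributes side by side. The sketch below is hypothetical: the `DirectoryEntry` class, its field names and the 1 to 5 rating scale are assumptions made for illustration, not a description of any existing directory product.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List


@dataclass
class DirectoryEntry:
    """One resource in a hypothetical open collaborative directory."""
    name: str
    # Hard facts about the resource (location, skills, status, ...).
    facts: Dict[str, str] = field(default_factory=dict)
    # Community-determined attributes: attribute name -> scores from peers.
    ratings: Dict[str, List[int]] = field(default_factory=dict)

    def rate(self, attribute: str, score: int) -> None:
        """Record one participant's score (assumed 1-5 scale) for an attribute."""
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5")
        self.ratings.setdefault(attribute, []).append(score)

    def summary(self) -> Dict[str, float]:
        """Average peer score for each community-defined attribute."""
        return {attr: mean(scores) for attr, scores in self.ratings.items()}


entry = DirectoryEntry(
    name="Example Contributor",
    facts={"location": "London", "skill": "data modelling"},
)
entry.rate("reliability", 5)
entry.rate("reliability", 4)
entry.rate("responsiveness", 3)
print(entry.summary())
```

Because the rated attributes are keyed by free-form names, each community can introduce its own measures without changing the structure of the entry, which is the kind of extensibility described above.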
This is very similar to Facebook’s like (and imaginary don’t like), but applied to the performance or quality of a resource as experienced in collaboration. Such peer review and measurement lies at the heart of Open Source development, a meritocracy where your contributions are evaluated by peers to determine your value and position within the group. Such information will prove invaluable to those seeking knowledge and the resources to get things done.
And why stop at people? Open Collaborative Directories can support any resource, be it curated knowledge bases, dictionaries, almanacs or compendiums.
As long as they are open and accessible they will serve and be served by the communities that need them. Because directory searches have little need for ranking they will be the first port of call for those who want more than the latest news or consumable.
Data image via Tom Woodward in Flickr Creative Commons
June 12, 2012 in Big Data | Tags: Analysis, BigData, collaboration, Curation, Data Definition, Data model, ILM, Knowledge, Knowledge Management, Lifecycle Management, Meaning, Networking, Open Source, Peter Drucker, Semantics, Social media, Social Media Value, Social Network, Tony Walker, Unstructured data
The more we know the less we understand. Nowhere is this more true than on the Social Network, where volume, velocity, volatility and variability are increasing on a daily basis. Those 4 Vs are part of a definition of big data, which includes both structured and unstructured data. We may have a reasonable chance of obtaining valuable information from the structured data population. That depends, of course, on the extremity of any single one or combination of the 4 Vs, yet author, time stamp, location or any other tag that accompanies a communication is easily identifiable. However, unstructured data poses a challenge several orders of magnitude greater. Structured data benefits from data models, data definitions and rules that enable us to extract reports and analyses, even to the point of discovering new relationships and information from the regimented data. To do so, we need to nurture and maintain these structures, to prevent a degradation of data quality and avoid conflicts and loss, a goal that often eludes the best efforts even in mature IT shops. Unstructured data enjoys none of these structures.
In general there are no data models, no data definitions, no rules and no discipline of housekeeping for unstructured data in Social Media. At least nothing that is commonly held. Individually, of course, we have an idea of what we are communicating, and we probably use both our own data definitions as well as those we assume are being used by others in any conversation; but these are amorphous concepts and certainly nothing that can be referenced by others or by cyber analysis. The same is true to a lesser degree in IT organizations and the worlds behind the firewalls. At least in those environments best practices such as change management and planned organization of unstructured data (viz. SharePoint) should ensure some semblance of control and order, if not insights into hidden information.
We do however have some rudimentary tools at our disposal, but like early man our technical bows and arrows are a poor match against the stampeding herd of beasts that is the social network stream. So like our ancient ancestors we have to develop strategies and skills that help us survive and thrive in this world of pervasive communications. Tony Wagner, author of “The Global Achievement Gap” identified three such skills that he believes are fundamental for us to foster and teach. He calls them the “three C’s – critical thinking, effective oral and written communication, and collaboration.” He also believes that this should be the prime focus of our educators, and that we should establish “a new National Education Academy, modeled after our military academies, to raise the status of the profession and to support the R and D that is essential for reinventing teaching, learning and assessment.”
Knowing how to perform the three C’s is therefore one of the keys to success. Being able to put this knowledge into practice, and bring organization and governance to bear on the resources and data, requires additional skills if enterprises plan to approach and consume the labor and thoughts of distributed social resources.
Taking these observations a little further, I believe the following 5 components are necessary in order to navigate, participate and collaborate in the world of social information.
1. Understanding – we need a better understanding of what we are dealing with in social media so that we can properly distinguish and farm target crops, whether they are preferences, demographics, opinions, gossip, information, knowledge, wisdom, or something altogether different. However, to improve that comprehension we need to be more aware of the dynamics of how we think, analyze, and communicate effectively. What, for example, is a thought, and what are the attributes of thought that make it consumable? We have a notion of answers to those questions but they are personal and subjective. Yet we cannot rely solely on subjective interpretation, so we need a shared and objective framework or model of knowledge. Knowledge is the lodestone of the social community, and the more we understand it, its nature, behaviors and properties, the more we can improve the discovery, sharing and use of valued information in the social stream.
Peter Drucker (1909-2005), one of the most respected commentators on management theory and practice, believed that “knowledge worker productivity” would be the next frontier of management. Drucker was also famous for his quote “If you can’t measure it, you can’t manage it”, to which I would add the following prefix: “If you can’t understand it, you can’t measure it.” Building a common understanding and framework(s) for knowledge management is essential in determining meaning, relevancy, relationship or other characteristics of information within contextual and cultural settings. We need to be able to detect when ambiguities and obfuscations are intended and make a documented judgement on meaning when they are not.
2. Networking – it might be stating the obvious to point out that people, individually and collectively, lie at the heart of the global social community. And it stands to reason that knowing who is who, and what they know, is another fundamental layer needed for success. The size and complexity of big social data demand a superior set of skills that can identify, analyze, classify and then connect individuals to each other and their knowledge sets. I described this in my previous post Network Weavers, which attempted to define the needed attributes (acquisition: filtration/review: association: curation: construction). As the dimensions of the network, the participants and their contributions grow, so will the level of skills, and proficient network weavers will become more of a premium resource than they are today. It is likely that networkers will depend on directories, personal or even corporate at first, but increasingly the directories will become more public and entries will contain more social information such as skills, contributions, preferences and factors that others will be able to use to determine relevancy and fit for purpose.
3. Analytics – With improved understanding of knowledge and how we use and abuse it, we can approach analysis with a higher level of confidence in the accuracy of our observations. There are techniques and technologies that attempt to extract meaning from unstructured data, but they still fall short of the human computer that is the brain when it comes to analyzing written and visual communications. As with humans, machine semantics are bounded by self-imposed rules and definitions, and as with humans, communication is improved if there is an agreed set between participating bodies. If those rules and definitions remain hidden and obscured then the output can only be regarded as personal opinion. Rating the relevancy or social worthiness of an individual or entity against undisclosed rules and definitions has as much value as the street corner tipster who whispers a sure-fire winner for any given horse race. Consequently social media demands semantic definitions that are shared amongst correspondents and a semantic analysis engine with the flexibility to parametrize selected characteristics so that relevancy can be tuned to group or community objectives.
4. Curation – In an earlier post, Curation – In Need of a Cure, I raised the need for knowledge workers to approach the care and maintenance of Social Media information in the same way that enterprises manage their data through Information Lifecycle Management. It is not enough just to store knowledge as we do currently with Pinterest, Tumblr, scoop.it and others: beyond catching the item in our personal butterfly net, our efforts resemble little more than childhood scrapbooks of things that caught our interest and appetites. Curation is an excellent term for the housekeeping that needs to be performed on the captured knowledge data. In museums and art galleries curation is a highly sophisticated skill set that seeks to first isolate the item of knowledge, then to expand it with information about its provenance (where it came from) and pedigree (e.g. what school of thought), augment it with related content (supporting and detracting) and finally exhibit it to educate and edify an interested audience. Curation is an essential component in building a rich and relevant knowledge base, and can and often does lead to new insights and innovations.
5. Collaboration – Unlike “Field of Dreams” you can’t just build a field and expect the games to begin. All the understanding, networking, analyzing and curating will bring but small value if you keep it all to yourself. The key to success lies in participation. The more you contribute, the greater value you generate both for yourself and for your correspondents. The root of the word collaboration is “labor”, meaning work or effort, and the prefix “Co” means sharing. The more you share and contribute the more you will be rewarded by your involvement with the social network. You will be further rewarded as others do the same, whether it’s contributing common rules and definitions, understanding of knowledge and thought, the names and skills of great social network participants, or exemplary curation of well defined and related content. It is the act of collaboration that provides the secret sauce of success and bridges the resources and knowledge in the social stream. This is not theory: this is proven without any shadow of doubt by the open source community. If you get the opportunity, interact with an open source contributor, and ask them for guidance; they have been doing it effectively, efficiently and profitably for more than a decade.
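The Analytics component above asks for shared definitions and a relevancy measure that each community can tune. The toy sketch below illustrates only that idea of disclosed, adjustable rules. It is plain keyword counting, not real semantic analysis, and the `SHARED_DEFINITIONS` term lists, the `relevancy` function and the weights are all invented for illustration.

```python
import re

# Hypothetical shared definitions: each characteristic is defined by a
# published term list that every correspondent can inspect.
SHARED_DEFINITIONS = {
    "curation": {"curate", "provenance", "archive"},
    "collaboration": {"share", "contribute", "peer"},
}


def relevancy(text, weights, definitions=SHARED_DEFINITIONS):
    """Score text as a weighted count of terms matching each characteristic."""
    words = re.findall(r"[a-z]+", text.lower())
    score = 0.0
    for characteristic, terms in definitions.items():
        hits = sum(1 for word in words if word in terms)
        # A community that does not weight a characteristic ignores it.
        score += weights.get(characteristic, 0.0) * hits
    return score


post = "We share, contribute and peer review notes, then curate them."

# Two communities tune the same disclosed rules to their own objectives.
archivists = {"curation": 2.0, "collaboration": 0.5}
developers = {"curation": 0.5, "collaboration": 2.0}

print(relevancy(post, archivists))
print(relevancy(post, developers))
```

The same post scores differently for the two groups, yet anyone can see why, because both the definitions and the weights are out in the open rather than hidden inside the engine.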
WARNING: Please don’t attempt any of the steps above without clear and careful planning