Information Architecture and Big Data Analytics

December 13, 2017

Dr. Anthony J. Rhem | A.J. Rhem & Associates

Information Architecture (IA) is an enabler for Big Data Analytics. You may be asking why I would say this, or how IA enables Big Data Analytics. We need to remember that Big Data includes all data (i.e., unstructured, semi-structured, and structured). The primary characteristics of Big Data (Volume, Velocity, and Variety) challenge your existing architecture and how you will effectively, efficiently, and economically process data to achieve operational efficiencies.

To derive the maximum benefit from Big Data, organizations must be able to handle the rapid delivery and extraction of huge volumes of data of varying types. This data can then be integrated with the organization’s enterprise data and analyzed. Information Architecture provides the methods and tools for organizing, labeling, building relationships (through associations), and describing (through metadata) your unstructured content, adding this source to your overall pool of Big Data. In addition, Information Architecture enables organizations to rapidly explore and analyze any combination of structured, semi-structured, and unstructured sources. Big Data requires Information Architecture to exploit relationships and synergies between your data. This infrastructure enables organizations to make decisions using the full spectrum of their Big Data sources.
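As a rough illustration of the paragraph above, the following Python sketch describes unstructured content with metadata and associations, then uses a shared metadata value to pair it with structured enterprise records. The `ContentItem` class, the `link_to_structured` helper, and all field names (such as `customer_id`) are assumptions made for this example and are not part of any specific tool.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """An unstructured content item described by IA metadata (hypothetical schema)."""
    item_id: str
    title: str
    content_type: str                                # e.g. "document", "video"
    metadata: dict = field(default_factory=dict)     # descriptive labels
    related_ids: list = field(default_factory=list)  # associations to other items


def link_to_structured(items, records, key):
    """Pair each content item with the structured records that share its value for `key`."""
    by_key = {}
    for record in records:
        by_key.setdefault(record[key], []).append(record)
    return {
        item.item_id: by_key.get(item.metadata.get(key), [])
        for item in items
    }


# Illustrative data only.
items = [
    ContentItem("doc-1", "Support call transcript", "document",
                metadata={"customer_id": "C42", "topic": "billing"},
                related_ids=["vid-7"]),
    ContentItem("vid-7", "Billing walkthrough", "video",
                metadata={"topic": "billing"}),
]
orders = [
    {"customer_id": "C42", "order_total": 120.0},
    {"customer_id": "C99", "order_total": 35.5},
]

linked = link_to_structured(items, orders, key="customer_id")
print(linked["doc-1"])  # structured records for customer C42
print(linked["vid-7"])  # empty: this item carries no customer_id metadata
```

The point of the sketch is that the metadata, not the content itself, is what lets an unstructured item participate in analysis alongside structured data.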

Big Data Components

IA Element | Volume | Velocity | Variety

Content Consumption

Volume: Provides an understanding of the universe of relevant content by performing a content audit. This contributes directly to the volume of available content.

Velocity: Contributes directly to the speed at which content is accessed by establishing the initial volume of available content.

Variety: Identifies the initial variety of content that will be a part of the organization's Big Data resources.

Content Generation

Volume: Fills gaps identified in the content audit by gathering the requirements for content creation/generation, which contributes directly to increasing the amount of content available in the organization's Big Data resources.

Velocity: Directly affects the speed at which content is accessed, because volumes are increasing.

Variety: Contributes to the creation of a variety of content (documents, spreadsheets, images, video, voice) to fill identified gaps.

Content Organization

Volume: Content Organization provides business rules to identify relationships between content and creates a metadata schema to assign content characteristics to all content. This increases the volume of data available and, in some cases, leverages existing data to assign metadata values.

Velocity: Directly improves the speed at which content is accessed by applying metadata, which in turn gives context to the content.

Variety: The Variety of Big Data will often drive the relationships and organization between the various types of content.

Content Access

Volume: Content Access is about search and establishing the standard types of search (i.e., keyword, guided, and faceted). This contributes to the volume of data, because the search parameters it establishes are often additional metadata fields and values that enhance search.

Velocity: Contributes to the ability to access content and to the speed and efficiency with which content is accessed.

Variety: Contributes to how the variety of content is accessed. The Variety of Big Data will often drive the search parameters used to access the various types of content.

Content Governance

Volume: The focus here is on establishing accountability for the accuracy, consistency, and timeliness of content, content relationships, metadata, and taxonomy within areas of the enterprise and the applications that are being used. Content Governance will often "prune" the volume of content available in the organization's Big Data resources by only allowing access to pertinent/relevant content, while either deleting or archiving other content.

Velocity: When the volume of content available in the organization's Big Data resources is trimmed through Content Governance, velocity improves because a smaller, more pertinent universe of content is made available.

Variety: When the volume of content available in the organization's Big Data resources is trimmed through Content Governance, the variety of content available may be affected as well.

Content Quality of Service

Volume: Content Quality of Service focuses on the security, availability, scalability, and usefulness of the content, and improves the overall quality of the volume of content in the organization's Big Data resources by:
- defending content from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording, or destruction
- eliminating or minimizing disruptions from planned system downtime
- making sure that the content that is accessed is from and/or based on the authoritative or trusted source, reviewed on a regular basis (based on the specific governance policies), modified when needed, and archived when it becomes obsolete
- enabling the content to behave the same no matter what application/tool implements it, and to be flexible enough to be used at an enterprise level as well as a local level without changing its meaning, intent of use, and/or function
- tailoring the content to the specific audience and ensuring that the content serves a distinct purpose, is helpful to its audience, and is practical.

Velocity: Content Quality of Service will eliminate or minimize delays and latency in your content and business processes by speeding up analysis and decision-making, directly affecting the content's velocity.

Variety: Content Quality of Service will improve the overall quality of the variety of content in the organization's Big Data resources through aspects of security, availability, scalability, and usefulness of content.

The table above aligns key information architecture elements to the primary components of Big Data. This alignment facilitates a consistent structure for effectively applying analytics to your pool of Big Data. The Information Architecture elements are Content Consumption, Content Generation, Content Organization, Content Access, Content Governance, and Content Quality of Service. It is this framework that will align all of your data so that business value can be gained from your Big Data resources.
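To make the Content Access and Content Governance rows of the table more concrete, here is a minimal Python sketch of faceted search over metadata and of a governance step that archives content not reviewed since a cutoff date. The catalog, its field names, the cutoff rule, and the helper functions are all assumptions chosen for illustration; a real implementation would depend on the organization's own metadata schema and governance policies.

```python
from datetime import date

# Hypothetical content catalog; field names and values are illustrative only.
catalog = [
    {"id": "a", "type": "document", "topic": "billing", "reviewed": date(2017, 11, 1)},
    {"id": "b", "type": "video", "topic": "billing", "reviewed": date(2014, 3, 15)},
    {"id": "c", "type": "document", "topic": "onboarding", "reviewed": date(2017, 6, 30)},
]


def facet_counts(items, facet):
    """Count items per value of one metadata field (the facet)."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


def faceted_search(items, **filters):
    """Return items whose metadata matches every selected facet value."""
    return [i for i in items if all(i.get(k) == v for k, v in filters.items())]


def prune(items, cutoff):
    """Governance step (assumed rule): archive items last reviewed before `cutoff`."""
    active = [i for i in items if i["reviewed"] >= cutoff]
    archived = [i for i in items if i["reviewed"] < cutoff]
    return active, archived


active, archived = prune(catalog, cutoff=date(2016, 1, 1))
print(facet_counts(active, "type"))
print([i["id"] for i in faceted_search(active, topic="billing")])
print([i["id"] for i in archived])
```

In this sketch, pruning reduces the volume that search has to cover, and it also removes the only video from the active set, which mirrors the table's observation that governance can affect variety as well as velocity.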

Note: This table originally appeared in the book Knowledge Management in Practice (ISBN: 978-1-4665-6252-3) by Anthony J. Rhem.

About the author:

Dr. Anthony J. Rhem leads the KM Institute's "Information Architecture and ECM" training and certification program. The next class to be offered in the DC area is planned for March 19-22. He serves as the President of A.J. Rhem & Associates, Inc., a privately held Information Systems Integration and Training firm located in Chicago, Illinois. Dr. Rhem is an Information Systems professional with over thirty (30) years of experience, a published author, and an educator, presenting the application and theory of Software Engineering Methodologies, Knowledge Management, and Artificial Intelligence.

In addition, Dr. Rhem serves as a Professor of Knowledge Management and Director of Research at The Knowledge Systems Institute - Master of Science Knowledge Management Program.
