RedditProfile | Metacortex


RedditProfile.com is a demonstration of the Metacortex.


Metacortex is a platform that identifies and organizes the knowledge and mood of people within any organization or network, using only the comments sent to its ingestion endpoint. This enables searching for expertise and content, mood tracking, trend analysis, and more, all accessible through either a web interface or an API.


Metacortex explanation


Imagine if Reddit were a large company with many remote workers and dispersed teams:

  • How would you identify duplicate work?
  • Who has expertise in the area you need help in?
  • Where is documentation about an internal project?
  • How is company morale?
  • How do people feel about Java?
  • etc, etc...


That's where Metacortex comes in!

Just send discussions to the ingestion API and it automatically identifies expertise, ranks content, tracks morale, and more. For more technical details, see how it works.


In addition to search, the APIs can also be integrated into third-party applications, such as:

  • Duplicate effort notifications
  • Employee satisfaction monitoring
  • Insider threat detection
  • Team management


API: Ingestion

Below is an example of a comment from a discussion sent to our server:


The system will then respond, unless otherwise specified, with the keywords (i.e. subject matter), sentiment, and a score (numerical measure of sentiment):
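As a rough illustration of this exchange, the sketch below serializes one comment and reads back the three pieces of information described above. The field names (`author`, `body`, `discussion_id`, `keywords`, `sentiment`, `score`), the JSON encoding, and the sample reply are all assumptions made for illustration; they are not the documented request or response format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class IngestionResult:
    """The three items the system is described as returning."""
    keywords: List[str]   # subject matter of the comment
    sentiment: str        # sentiment label
    score: float          # numerical measure of sentiment


def build_comment_payload(author: str, body: str, discussion_id: str) -> str:
    """Serialize one comment as JSON. Field names are hypothetical."""
    return json.dumps(
        {"author": author, "body": body, "discussion_id": discussion_id}
    )


def parse_ingestion_response(raw: str) -> IngestionResult:
    """Parse a reply, assuming it is JSON with these hypothetical keys."""
    data = json.loads(raw)
    return IngestionResult(
        keywords=list(data["keywords"]),
        sentiment=str(data["sentiment"]),
        score=float(data["score"]),
    )


payload = build_comment_payload(
    "example_user", "I really enjoy writing Java.", "thread-1"
)

# Hand-written stand-in for a server reply, not real output.
sample_response = '{"keywords": ["java"], "sentiment": "positive", "score": 0.8}'
result = parse_ingestion_response(sample_response)

print(payload)
print(result)
```

No network call is made here; in practice the payload would be sent to the ingestion endpoint with whatever HTTP client your organization already uses.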


After ingesting comments, you can use our search application, exactly like this demo (redditprofile.com).

For reference, to set up this demo, all we had to do was send Reddit comments directly to our ingestion endpoint (just like the example above); our system did the rest from there!


In addition to search, the APIs can also be integrated into third-party applications (examples on the about page). The APIs available today are as follows:

  • Trends - Search how often a topic is discussed
  • Expert Opinion - Opinions of experts on a given topic
  • Promotional Score - How the general populace is feeling about a topic

  • Comments - Search all the comments, keywords, sentiment, and more
  • Content - Search all the content, ranked by importance
  • Author_Profiles - Search the profiles of those within the network
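
To give a feel for how a third-party application might call these APIs, here is a minimal sketch that builds a query URL for a topic. The base URL, the path names, and the `q` parameter are hypothetical placeholders chosen for illustration; the real endpoint layout may differ.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of your own deployment.
BASE_URL = "https://metacortex.example/api"

# Path names are assumptions derived from the API names listed above.
ENDPOINTS = {
    "trends",
    "expert_opinion",
    "promotional_score",
    "comments",
    "content",
    "author_profiles",
}


def build_query_url(endpoint: str, topic: str) -> str:
    """Return a query URL for one of the (hypothetically named) APIs."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    # urlencode escapes the topic so spaces and symbols are URL-safe.
    return f"{BASE_URL}/{endpoint}?{urlencode({'q': topic})}"


print(build_query_url("trends", "java"))
print(build_query_url("expert_opinion", "machine learning"))
```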

For further details, feel free to contact us!


Deployment & Scalability

In addition to our API, we believe all systems should be easy to deploy. Thus, if you can deploy a web application in your organization, you can deploy Metacortex.

Metacortex is composed of three components:

  • Ingestion - A Python Flask app, which can scale as needed. Each app requires 220 MB of RAM and can handle an average of 70 comments per second on a 2.2 GHz CPU (one app can run per thread).

  • Query API - A Ruby on Rails app, which can scale as needed. Each app requires 4 GB of RAM, formats responses, and also hosts the basic UI (this demo).

  • Database - A PostgreSQL database, which can hold the comments of a 100,000-person company on a single instance. This demo uses an AWS RDS db.m4.xlarge (with general purpose SSD storage). It is primarily IOPS bound; the current setup can handle up to 100 million comments without issue and can scale up as needed.


All of the components above are easy to deploy and scale very well. For reference, the current demo contains 25 million comments, and the platform can process up to 2800 comments per second.

Yet it costs only $600 / month in infrastructure on AWS (with on-demand instances).
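
For a back-of-the-envelope sizing of the ingestion tier, the per-app figures above (70 comments per second and 220 MB of RAM per app) can be turned into a small calculation. This sketch assumes throughput scales linearly with the number of ingestion apps and ignores the Query API and database, which is a simplification; the function name is ours.

```python
import math

COMMENTS_PER_SECOND_PER_APP = 70  # average per ingestion app, from the figures above
RAM_MB_PER_APP = 220              # RAM per ingestion app, from the figures above


def ingestion_capacity(target_comments_per_second: int) -> tuple:
    """Return (apps needed, total RAM in MB), assuming linear scaling."""
    apps = math.ceil(target_comments_per_second / COMMENTS_PER_SECOND_PER_APP)
    return apps, apps * RAM_MB_PER_APP


apps, ram_mb = ingestion_capacity(2800)
print(apps, ram_mb)  # 40 8800
```

Under these assumptions, 2800 comments per second corresponds to 40 ingestion apps using about 8.8 GB of RAM in total.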


Expert Rank

We built this system by categorizing experts online and analyzing their discussions through data mining, sentiment analysis, and, more generally, machine learning. We call our ranking algorithm ExpertRank, in homage to PageRank.

Without going into too much detail here, ExpertRank works by determining an author's expertise in a network, then using their opinions within that expertise to rank a topic or piece of content. This is in contrast to PageRank, which weights pages, domains, or institutions as credible; we rank people.

This also means we never actually look at the content being discussed! Quite literally, all the content that appears in the searches on the demo appears only because experts are discussing it as it relates to their expertise.
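
To make the idea concrete, here is a deliberately simplified toy, not the actual ExpertRank algorithm. It assumes an author's expertise on a topic is simply their share of all comments on that topic, and that a piece of content's rank is the sum of expertise-weighted sentiment scores from comments that mention it. The data, names, and weighting are all invented for illustration. Note that only content identifiers are used; the content itself is never inspected.

```python
from collections import defaultdict

# Toy comments: (author, topic, content_id, sentiment_score). Invented data.
comments = [
    ("alice", "java", "doc-a", 0.9),
    ("alice", "java", "doc-b", -0.4),
    ("alice", "java", "doc-a", 0.5),
    ("bob", "java", "doc-b", 0.8),
    ("bob", "cooking", "doc-c", 0.7),
]


def expertise(comments):
    """Toy expertise: an author's share of all comments on a topic."""
    per_topic = defaultdict(int)
    per_author_topic = defaultdict(int)
    for author, topic, _content_id, _sentiment in comments:
        per_topic[topic] += 1
        per_author_topic[(author, topic)] += 1
    return {
        (author, topic): count / per_topic[topic]
        for (author, topic), count in per_author_topic.items()
    }


def rank_content(comments, topic):
    """Rank content on a topic by expertise-weighted sentiment, best first."""
    weights = expertise(comments)
    scores = defaultdict(float)
    for author, comment_topic, content_id, sentiment in comments:
        if comment_topic == topic:
            scores[content_id] += weights[(author, comment_topic)] * sentiment
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_content(comments, "java")
print([content_id for content_id, _score in ranking])  # ['doc-a', 'doc-b']
```

In this toy, alice wrote three of the four Java comments, so her opinions carry three times the weight of bob's, and the content she praises ranks first.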

Note: The current system only has Reddit comments, which amount to a few comments a week (at most) for the typical user. Imagine this system in a company, with hundreds more comments per user per week: the system's accuracy grows substantially.