This guide contains the information you’ll need to start working with Receptiviti API. It covers all relevant information concerning:
– Connecting to the API
– Sending text samples
– Receiving analysis of text samples
– Interpreting results
– Sample code
This document is intended for people responsible for the integration and operation of APIs.
1. Terms and Definitions
API Application Programming Interface.
API Key A code generated by the API server, which is unique to a particular client. The API cannot be used without a valid API key.
API Secret Key A password that is used to authenticate a user. The Secret Key will be generated/regenerated when the user clicks the “Get API Key” button.
Writing Sample Text of natural language from a person or a brand that can be submitted to the Receptiviti API for analysis. Currently, Receptiviti API supports English language text only.
Client Reference ID A unique identifier that a client uses to link a writing sample to a particular record on the client system (e.g. a name, contact, social handle, survey respondent).
Content Source Receptiviti API requires the client to indicate the source of the text sent to the API (i.e. personal writing, personal email correspondence, professional correspondence, social media, commercial writing, professional/scientific writing, anonymous review, Twitter).
Swagger Receptiviti uses Swagger (the world's most popular framework for APIs) for our API documentation.
2. Two ways to interact with Receptiviti
Receptiviti provides users with two different ways to use our analytics engine:
A. REST API: Receptiviti provides a REST API to enable clients to analyze large quantities of text on an ongoing or programmatic basis. To use the API you will need both an API Key and an API Secret Key. See “API credentials” below for details on retrieving your keys.
B. Web-Based User Interface: Receptiviti provides users with a web-based user interface that includes a text field in which users can paste raw language text for analysis. The user interface also features a range of outputs – most of which are included in the API outputs. The user interface is intended to provide insight into Receptiviti API capabilities and to experiment with API features. The user interface should not be used for ongoing programmatic analysis of large volumes of text.
3. API credentials
To use Receptiviti API you will need both an API Key and an API Secret Key. Click the “Get API Key” button at the top of the Home page to retrieve your API Key and API Secret Key. A .csv file will be generated that contains both keys.
Receptiviti’s Web-based User Interface does not require an API Key or an API Secret Key.
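As a minimal sketch of how the two keys might be attached to a REST request, the Python snippet below builds a set of HTTP headers. The header names (X-API-KEY, X-API-SECRET-KEY) and the environment variable names are assumptions made for illustration; confirm the actual names on the Swagger page before relying on them.

```python
import os


def build_auth_headers(api_key, api_secret):
    """Return HTTP headers that carry both credentials.

    The header names used here are assumptions, not confirmed by this guide.
    """
    if not api_key or not api_secret:
        raise ValueError("Both an API Key and an API Secret Key are required")
    return {
        "X-API-KEY": api_key,            # assumed header name
        "X-API-SECRET-KEY": api_secret,  # assumed header name
        "Content-Type": "application/json",
    }


# Hypothetical environment variable names; placeholder values are used if unset.
headers = build_auth_headers(
    os.environ.get("RECEPTIVITI_API_KEY") or "demo-key",
    os.environ.get("RECEPTIVITI_API_SECRET") or "demo-secret",
)
```

Keeping the keys in environment variables rather than in source code avoids committing the contents of the .csv file to version control.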
Any standard REST library can be used to talk with Receptiviti API. Swagger-codegen can be used to generate client libraries in different languages by pointing the tool to our Swagger JSON. Alternatively, you can try out the API on our Swagger page.
5A. People API Inputs (for LIWC and Receptiviti analysis)
The Receptiviti People API enables users to analyze individual writing samples, or to submit and timestamp multiple writing samples from a person or from multiple people. Submitted timestamped scores can be retrieved to facilitate time-series analysis of a person or multiple people.
When using the People API, users can create a Person or multiple People and post writing samples to be associated with (written by) that person.
WHEN CREATING A PERSON THROUGH THE PEOPLE API, POSTS SHOULD INCLUDE THE FOLLOWING:
Name: Name of person (or instead of a name, a unique identifier of your choosing)
Gender: Gender of the Person
“1” denotes female
“2” denotes male
“0” denotes N/A
Person_ID (optional): A unique identifier that can be used to identify the author of the writing sample. Suggestions include an email address, a social handle, an ID from a CRM system, etc. Alternatively, any form of custom codification will suffice. Users cannot create two people with the same Person_ID.
Person_Tags (optional): One or more strings of text that can be used to classify the text sample being submitted, or to classify the author of the submitted text sample.
Content (optional): When creating a person, a writing sample can be provided for that person to create both the person and a writing sample associated with them at the same time. See the Writing Sample section (below) for details.
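The fields above can be assembled into a request body along the following lines. This is a sketch only: the JSON key names (name, gender, person_id, person_tags, content) are assumed lower-case versions of the field names in this guide, and the build_person helper is hypothetical. Check the Swagger documentation for the exact schema.

```python
import json

# Gender codes as listed in this guide.
GENDER_CODES = {"n/a": 0, "female": 1, "male": 2}


def build_person(name, gender="n/a", person_id=None, person_tags=None, content=None):
    """Build a dict describing a Person; optional fields are left out when unset.

    The key names are assumptions, not the confirmed API schema.
    """
    person = {"name": name, "gender": GENDER_CODES[gender]}
    if person_id is not None:
        person["person_id"] = person_id
    if person_tags:
        person["person_tags"] = list(person_tags)
    if content is not None:
        person["content"] = content
    return person


person = build_person(
    "Person A",
    gender="female",
    person_id="person_a@example.com",
    person_tags=["survey-2020"],
)
body = json.dumps(person)  # JSON text that could be sent as the POST body
```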
WHEN SUBMITTING A WRITING SAMPLE FOR A PERSON, POSTS SHOULD INCLUDE THE FOLLOWING:
Language_Content: This is the text writing sample that is posted to be analyzed. Content samples should be natural language from a person, or a person’s social posting, messages, voice transcript, personal writing, personal email correspondence, professional correspondence, social media, commercial writing, professional/scientific writing or other such types of language. For users who wish to analyze the personality of a brand, Content can also be messaging from a brand (sourced from brand-developed content, a website, marketing collateral, etc.). Notes:
The content sample should be from a single individual. Content from two or more people should not be merged and sent to the API as a single post.
Content_ID (optional): A unique identifier that can be used to identify each submitted writing sample. Typically, a tweet ID, an email message ID, or some form of unique ID that can be used to identify each individual writing sample.
Content_Date (optional): By default, if the user doesn’t include a date with the post the system will automatically timestamp the sample with the date and time of submission. The user can override the automatic timestamp by providing a date and time with the sample. Overriding automatic time stamping can be useful, for example, if there is a need to backdate writing samples.
Date and Time format is defined by ISO 8601:2000 data elements and interchange formats; complete date plus hours and minutes: YYYY-MM-DDThh:mmTZD (e.g. 1997-07-16T19:20+01:00)
Content_Tags (optional): One or more strings of text that can be used to classify the text sample being submitted.
Content_Source (integer): This is the source of content:
“1” denotes Personal Writing
“2” denotes Personal Email Correspondence
“3” denotes Professional Correspondence
“4” denotes Social Media
“5” denotes Commercial Writing
“6” denotes Professional or Scientific Writing
“7” denotes Anonymous Review Text
“8” denotes Twitter Content
“0” denotes other
Language (string): Receptiviti API supports English and Spanish. LIWC scores can be generated for samples in both English and Spanish; however, at this time Receptiviti scores are only generated from English language samples.
When submitting samples, specify the language of the sample being submitted using the values below. English is the default language, so if language is not specified, the API will mark the sample as English.
Specify one of the supported languages for each sample submitted:
Recipient_ID (string): If the writing sample was a conversation with another person who has been created in the system, provide the Recipient_ID for that person. This data will be used when calculating LSM (details below).
Input Channels: We strongly advise against merging language from multiple sources (Twitter, Facebook, survey response, transcript) into one post. If multiple channels of content are to be analyzed, we recommend creating a separate Person_ID for each type of content. For example, to analyze Person A’s Twitter and Facebook streams, create one “Person_A_Twitter” Person_ID to which only Person A’s Twitter data is sent, and one “Person_A_Facebook” Person_ID to which only Person A’s Facebook data is sent.
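The writing sample fields described in this section can be put together as in the sketch below, which also formats Content_Date in the YYYY-MM-DDThh:mmTZD form and maps a readable source name to its Content_Source integer. The JSON key names, the source labels and both helper functions are assumptions for illustration; only the integer codes and the date format come from this guide.

```python
from datetime import datetime, timedelta, timezone

# Content_Source integer codes as listed in this guide; the labels are our own.
CONTENT_SOURCES = {
    "other": 0,
    "personal_writing": 1,
    "personal_email": 2,
    "professional_correspondence": 3,
    "social_media": 4,
    "commercial_writing": 5,
    "professional_or_scientific_writing": 6,
    "anonymous_review": 7,
    "twitter": 8,
}


def format_content_date(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mmTZD."""
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("Content_Date needs a timezone-aware datetime")
    return moment.isoformat(timespec="minutes")


def build_writing_sample(language_content, content_source, content_date=None,
                         content_id=None, content_tags=None, recipient_id=None):
    """Build a dict for one writing sample; key names are assumptions."""
    sample = {
        "language_content": language_content,
        "content_source": CONTENT_SOURCES[content_source],
    }
    if content_date is not None:
        sample["content_date"] = format_content_date(content_date)
    if content_id is not None:
        sample["content_id"] = content_id
    if content_tags:
        sample["content_tags"] = list(content_tags)
    if recipient_id is not None:
        sample["recipient_id"] = recipient_id
    return sample


sample = build_writing_sample(
    "Text written by a single person goes here.",
    "twitter",
    content_date=datetime(1997, 7, 16, 19, 20, tzinfo=timezone(timedelta(hours=1))),
    content_tags=["demo"],
)
```

Omitting content_date leaves the timestamp to the API, which matches the default behaviour described for Content_Date.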
5B. LSM API Inputs
The Receptiviti LSM API enables users to evaluate how much rapport exists between two people.
LSM is based on research by James W. Pennebaker into the relationship between rapport and the degree to which people show similar rates of usage of function words when they communicate with each other. Identifying the commonalities and differences in their relative use of function words helps us to understand the degree to which the two people are in sync.
Research has shown that LSM is associated with the quality of interpersonal relationships and how long relationships last, and that it can even predict the formation of relationships. In speed dating environments, LSM has out-predicted couples themselves when evaluating the likelihood of mutual romantic interest, and it’s also been used to predict the likelihood of divorce among married couples.
In order to calculate LSM, the recipient must be specified when a person’s writing sample is submitted. LSM requires between 500 and 750 words from each person being analyzed.
WHEN REQUESTING LSM SCORE THROUGH THE LSM API, REQUESTS SHOULD INCLUDE THE FOLLOWING IN THE URL:
Person 1 (string): The Receptiviti ID for person 1.
Person 2 (string): The Receptiviti ID for person 2.
From Date (ISO Date format): The first date from which writing samples from the two individuals are to be evaluated
To Date (ISO Date format): The last date until which writing samples from the two individuals are to be evaluated
Tags (string): To include writing samples with the specified tags
Language (string): To include samples with the specified language (English or Spanish)
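Because these values are passed in the URL, a request could be composed roughly as follows. The base URL, the query parameter names and the build_lsm_url helper are all hypothetical; the real path and parameter names should be taken from the Swagger documentation.

```python
from urllib.parse import urlencode


def build_lsm_url(base_url, person1, person2, from_date=None, to_date=None,
                  tags=None, language=None):
    """Compose an LSM request URL; parameter names are assumptions."""
    params = [("person1", person1), ("person2", person2)]
    if from_date is not None:
        params.append(("from_date", from_date))
    if to_date is not None:
        params.append(("to_date", to_date))
    for tag in tags or []:
        params.append(("tags", tag))  # one repeated parameter per tag (assumed)
    if language is not None:
        params.append(("language", language))
    # urlencode percent-escapes any characters that are unsafe in a query string.
    return f"{base_url}?{urlencode(params)}"


url = build_lsm_url(
    "https://api.example.com/lsm_score",  # placeholder, not the real endpoint
    "person-id-1",
    "person-id-2",
    from_date="2020-01-01",
    to_date="2020-06-30",
)
```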
5C. Importing A Spreadsheet
If you’re not a developer and prefer to import your data from a spreadsheet, Receptiviti provides an option to import a .csv file.
WHEN IMPORTING A CSV, THE FOLLOWING FIELDS ARE AVAILABLE FOR MAPPING:
Name (optional): Name of person (or instead of a name, a unique identifier of your choosing)
Gender (optional): Gender of person
“1” denotes female
“2” denotes male
“0” denotes N/A
Person Handle (optional, will auto-generate if not provided): A unique identifier for the author of the content, e.g. an email address, social media handle, etc. Any form of unique code will suffice. Two people cannot share the same Person Handle.
Person Tags (optional): One or more text strings, separated by semicolons. These can be used to classify or group the person among all people submitted. Examples of common tags include department or team in an organization, or metadata to correlate with results such as performance reviews or demographic factors.
Content (required): Text content to be analyzed. Content samples should be natural language from a person. Examples of this include but are not limited to: social media postings, messages, transcripts of spoken language, email correspondence, commercial writing, etc. Text should contain only ASCII characters and should not contain encoded line breaks. Every sentence within the text should be separated by a punctuation mark. Text should not include emojis or any other non-text-based elements, but may include emoticons.
Content Handle (optional, will auto-generate if not provided): A unique identifier for each content sample. Two content samples cannot have the same Content Handle.
Content Date (optional): The date on which the content sample was written/spoken. If omitted, this will default to the date and time of CSV upload. Date and Time format is defined by ISO 8601:2000 data elements and interchange formats; complete date plus hours and minutes:
YYYY-MM-DDThh:mmTZD (e.g. 1997-07-16T19:20+01:00)
Content Tags (optional): One or more text strings, separated by semicolons. These can be used to classify or group the content sample among all content submitted. Examples of common tags include purpose, situation, or topic of content.
Content Source (optional): The source of the content, indicated by an integer. Options are:
“1” denotes Personal Writing
“2” denotes Personal Email Correspondence
“3” denotes Professional Correspondence (email or other)
“4” denotes Social Media (other than Twitter)
“5” denotes Commercial Writing
“6” denotes Professional or Scientific Writing
“7” denotes Anonymous Review Text
“8” denotes Twitter Content
“0” denotes other
Content Language (optional): Receptiviti currently supports English and Spanish for all users. LIWC scores can be generated for samples in both English and Spanish, but Receptiviti scores are only generated from English language samples at this time. If language is not specified, content will be scored in English. Specify one of the supported languages for each content sample submitted, in the form of a string:
Recipient Handle (optional): Associates the content with another person already created for the purpose of analyzing the relationship between two people. Note that a Recipient Handle must be identical to a person who is already present in the system.
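A spreadsheet with these fields can also be produced programmatically. The sketch below writes rows to CSV text, joins tags with semicolons, and flattens line breaks and drops non-ASCII characters from Content, as the Content field requires. The column names simply reuse the field names above (columns are mapped at import time, so other names may work equally well), and the helper functions are hypothetical.

```python
import csv
import io

FIELDS = [
    "Name", "Gender", "Person Handle", "Person Tags", "Content",
    "Content Handle", "Content Date", "Content Tags", "Content Source",
    "Content Language", "Recipient Handle",
]


def clean_content(text):
    """Collapse line breaks and whitespace runs, then drop non-ASCII characters.

    Dropping characters is a blunt choice (accented letters are lost);
    transliterating them first may be preferable for some data.
    """
    flattened = " ".join(text.split())
    return flattened.encode("ascii", errors="ignore").decode("ascii")


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text; missing fields are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        row["Content"] = clean_content(row["Content"])
        for key in ("Person Tags", "Content Tags"):
            if isinstance(row.get(key), (list, tuple)):
                row[key] = ";".join(row[key])
        writer.writerow(row)
    return buffer.getvalue()


csv_text = rows_to_csv([
    {"Name": "Person A", "Gender": 1, "Person Tags": ["sales", "team-1"],
     "Content": "First line.\nSecond line \u2603.", "Content Source": 2},
])
```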
6. Sample Size Requirements
Receptiviti API requires a minimum number of words to generate statistically significant results. For LIWC analysis, we recommend a minimum of 50 words. For Receptiviti analysis, we recommend a minimum of 300 words. LSM requires between 500 and 750 words from each person being analyzed. For both LIWC and Receptiviti analysis, if your language sample is smaller than the recommended minimum word count, be careful in interpreting results.
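A quick pre-submission check against these thresholds might look like the sketch below. Counting words by splitting on whitespace is an assumption; the API's own word count may differ slightly, so treat borderline samples with care. The helper names are hypothetical.

```python
# Minimum word counts taken from this section (LSM uses the lower bound of 500).
MIN_WORDS = {"liwc": 50, "receptiviti": 300, "lsm": 500}


def word_count(text):
    """Approximate word count using whitespace splitting."""
    return len(text.split())


def analyses_supported(text):
    """Report which analyses meet their recommended minimum word count."""
    count = word_count(text)
    return {name: count >= minimum for name, minimum in MIN_WORDS.items()}


report = analyses_supported("word " * 60)
```

Here the 60-word sample meets the LIWC recommendation but falls short of the Receptiviti and LSM minimums.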
Receptiviti API outputs comprise three distinct types of analysis – LIWC outputs, Receptiviti outputs and LSM outputs:
7A. Receptiviti Outputs
Receptiviti API generates 20+ additional psychological measures (called Receptiviti measures). Receptiviti measures are presented in three ways: as raw scores (the results of the Receptiviti algorithms), as scores on a 100-point scale ranging from 0 to 100 (scaled in the same manner as the LIWC summary variables), and on a 5-point scale that utilizes a logarithmic distribution of results. At this time, Receptiviti scores can only be generated from English language samples.
Receptiviti outputs are grouped into the five types of psychological insights they represent: Cognitive/Thinking Style Insights, Big 5 Insights, Social Style Insights, Emotional Style Insights, and Working Style Insights. Definitions of the outputs are available in the table below.
The Receptiviti outputs are:
Receptiviti Measure Definition
Cognitive/Thinking Style Insights:
Thinking Style Measures the degree to which the person is an analytical thinker who relies on facts and data or instinct and feelings when making decisions.
Persuasive Measures the degree to which a person is able to create rapport with the intention of persuading others.
Reward Bias Measures the degree to which a person weighs risks vs. rewards when making decisions.
Big 5 Insights:
Openness Measures the degree to which a person is open to new ideas and new experiences.
Artistic Measures how much a person appreciates and enjoys the arts.
Intellectual Measures how strongly a person is inclined toward intellectual and academic learning.
Liberal Measures how socially and ideologically liberal a person is.
Imaginative Measures to what degree a person is imaginative.
Emotionally Aware Measures to what degree a person is conscious of and connected with their feelings and emotions.
Adventurous Measures the degree to which a person enjoys and seeks out adventure.
Conscientiousness Measures the degree to which a person is reliable.
Self-assured Measures how much confidence a person has in themselves.
Disciplined Measures a person's propensity to follow routines and rules.
Ambitious Measures the degree to which a person is ambitious or driven by the desire for achievement.
Dutiful Measures a person's sense that they should respect expectations and authority.
Cautious Measures how cautiously a person tends to act.
Organized Measures how organized and orderly a person is.
Extraversion Measures the degree to which a person feels energized and uplifted when interacting with others or engaging in activity.
Sociable Measures how much a person seeks out and enjoys social situations.
Friendly Measures how friendly a person generally is and how positive they are when interacting with others.
Assertive Measures how assertive a person is and how comfortable a person is with expressing their ideas and needs.
Energetic Measures how much energy and enthusiasm a person tends to have.
Cheerful Measures how happy and cheerful a person generally acts.
Active Measures how strongly a person feels the need for activity and engagement in their life.
Agreeableness Measures the degree to which a person is inclined to please others.
Generous Measures how much a person enjoys spending their time and money on others.
Trusting Measures how easily a person trusts others.
Cooperative Measures how well a person takes into account the needs of others.
Empathetic Measures how strongly a person internalizes the feelings of others.
Genuine Measures how genuine and honest a person is.
Humble Measures how humble and modest a person is.
Neuroticism Measures the degree to which a person expresses strong negative emotions.
Impulsive Measures how inclined a person is to act impulsively.
Stressed Measures the degree to which a person is experiencing stress and how strongly affected they are by it.
Anxious Measures the degree to which a person is experiencing anxiety and how strongly affected they are by it.
Aggressive Measures the degree to which a person exhibits anger or aggression.
Melancholy Measures how much a person is expressing sadness.
Self-conscious Measures how likely a person is to feel embarrassed or anxious about themselves or their skills.
Social Style Insights:
Social Skills Measures the degree to which a person feels at ease with others and is able to navigate social situations.
Insecure Measures the degree to which a person lacks confidence when dealing with others.
Cold Measures the degree to which a person is emotionally unresponsive and has difficulty empathizing with others.
Family Orientation Measures the degree to which a person’s values and behaviors are rooted in their sense of family.
Emotional Style Insights:
Adjustment Measures the degree to which a person is grounded, is able to maintain quality relationships with others, and establishes healthy life goals.
Happiness Measures the degree to which a person is optimistic, upbeat, and happy.
Depression Measures the degree to which a person may have difficulty finding joy in their life.
Working Style Insights:
Independent Measures the degree to which a person is a non-conformist.
Power Driven Measures the degree to which a person is driven by the desire for power.
Type-A Measures the degree to which a person is driven and competitive.
Workhorse Measures the degree to which a person has a strong work ethic vs. preference for leisure and non-work activity.
Interests and Orientations:
Friendship Focused Measures the degree to which a person focuses on friends and friendship, and likely spends time thinking about their social connections.
Body Focus Measures the degree to which a person focuses attention on their body or other people's bodies.
Health Oriented Measures the degree to which a person is focused on health, likely spends time thinking about their own health or the health of others.
Sexual Focus Measures the degree to which a person focuses on sexuality, sex-related themes, concepts and ideas.
Food Focus Measures the degree to which a person focuses thoughts on eating or drinking, and likely enjoys discussing food or drinks with others.
Leisure Oriented Measures the degree to which a person thinks about leisure activities such as sports, entertainment, travel, or organized events.
Money Oriented Measures the degree to which a person thinks about money and finances. May be focused on personal finances, the broader economy or both.
Religion Oriented Measures the degree to which a person focuses on religion, and likely spends time discussing religion, religious themes, ideas and topics.
Work Oriented Measures the degree to which a person is focused on, or preoccupied with work or school.
Netspeak Measures the degree to which a person is comfortable communicating with Internet shorthand and instant messaging slang, abbreviated words, acronyms and special characters.
Receptiviti scores are outputted as both percentiles and raw scores:
Percentiles are “Scaled” Receptiviti scores, using z-transformation to convert numbers into a percentile.
Raw scores for each of the Receptiviti measures (e.g., neuroticism, independence, etc.) all occur on different scales based on the metrics used to create them. For example, the raw score for openness has a possible range of -2.5 to +7.5, whereas the raw score for neuroticism can range from -5 to +20. The following list shows the hypothetical minimum and maximum for each of the raw Receptiviti scores:
HYPOTHETICAL MINIMUMS AND MAXIMUMS FOR RAW RECEPTIVITI SCORES
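As a rough illustration of the percentile scaling described in this section, a z-transformation followed by the standard normal cumulative distribution turns a raw score into a percentile. The reference mean and standard deviation used below are made-up numbers, since this guide does not publish the values Receptiviti uses, and the exact scaling procedure may differ.

```python
import math


def raw_to_percentile(raw, mean, std):
    """Convert a raw score to a percentile via a z-score and the normal CDF."""
    z = (raw - mean) / std
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical reference values for one measure, for illustration only.
at_mean = raw_to_percentile(2.5, mean=2.5, std=1.0)        # score equal to the mean
one_sd_above = raw_to_percentile(3.5, mean=2.5, std=1.0)   # one standard deviation above
```

A raw score equal to the reference mean lands at the 50th percentile, and a score one standard deviation above it lands at roughly the 84th, which is why percentiles are easier to compare across measures than raw scores on different scales.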
7B. LIWC Outputs
Receptiviti API generates 93 LIWC2015 outputs (LIWC2015 is the newest version of LIWC). LIWC2015 output variables are expressed as a percentage of total words. There are six exceptions: word count (WC; raw word count), words per sentence (WPS; mean words per sentence), and four summary variables: Analytic, Clout, Authentic, and Tone. Each of the summary variables is a standardized composite based on previously published research. The composites have been converted to percentiles based on large corpora of texts described in the LIWC2015 Language Manual. The summary variables are:
Analytical thinking: A high number reflects formal, logical, and hierarchical thinking; lower numbers reflect more informal, personal, here-and-now, and narrative thinking.
Clout: A high number suggests that the author is speaking from the perspective of high expertise and is confident; low Clout numbers suggest a more tentative, humble, even anxious style.
Authentic: Higher numbers are associated with a more honest, personal, and disclosing text; lower numbers suggest a more guarded, distanced form of discourse.
Tone: A high number is associated with positivity, while a low number reveals greater anxiety, sadness, or anger. A number around 50 suggests either a lack of emotionality or similar amounts of positive and negative emotions.
View the complete list of LIWC variables
7C. Emotional Analysis Outputs
Emotional tone: Scores range from Negative (1) to Positive (99). A score around 50 is considered Neutral and suggests either a lack of emotionality or similar amounts of positive and negative emotions.
Emotional analysis: Each Emotional Analysis facet score indicates what percentage of the total negative emotion expressed in the language sample corresponds with each primary negative emotion.
7D. LSM Outputs
Typical LSM scores range from .50 to 1.00. The closer the score is to 1.00, the more in sync the two people are. Note: for LSM, the phenomenon of being “in sync” can apply in both positive and negative ways. For example, analysis of language from two people who are quarrelling may also result in a high LSM score.
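To give a feel for how such a score arises, the sketch below applies the LSM formula from the published research literature: for each function-word category, the score is 1 minus the absolute difference of the two people's usage rates divided by their sum, and the category scores are then averaged. This is not necessarily how Receptiviti computes its LSM output. The category keys are assumed LIWC-style names and the usage rates are invented.

```python
# Function-word categories commonly used in LSM research (key names assumed).
FUNCTION_WORD_CATEGORIES = (
    "ppron", "ipron", "article", "auxverb", "adverb",
    "prep", "conj", "negate", "quant",
)


def lsm_score(rates_a, rates_b):
    """Average per-category style matching for two people.

    Each rates dict maps a category to the percentage of total words in it.
    The small constant keeps the division defined when both rates are zero.
    """
    scores = []
    for category in FUNCTION_WORD_CATEGORIES:
        a, b = rates_a[category], rates_b[category]
        scores.append(1.0 - abs(a - b) / (a + b + 0.0001))
    return sum(scores) / len(scores)


# Invented usage rates (percent of total words) for two people.
person_1 = {category: 10.0 for category in FUNCTION_WORD_CATEGORIES}
person_2 = {category: 5.0 for category in FUNCTION_WORD_CATEGORIES}

identical = lsm_score(person_1, person_1)  # same rates in every category
mismatched = lsm_score(person_1, person_2)
```

Identical usage rates give a score of 1.0, while the mismatched pair above scores about 0.67. The formula is symmetric, so the order of the two people does not matter.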
8. Language Sample Conventions:
Receptiviti does not discriminate between upper- and lower-case letters. It can only count words contained in the LIWC2015 dictionary, and as such it cannot accommodate misspellings, colloquialisms, foreign words, proper names, scientific or technical terms, and many abbreviations.
Spelling, abbreviations, contractions
Receptiviti can accommodate both United States and British spellings.
Receptiviti can also accommodate common verb contractions including: don’t, won’t, isn’t, shouldn’t, can’t, couldn’t, I’m, I’ll, I’d, we’re, we’d, you’re, he’s, it’s, etc. Most others will be simply counted as possessive nouns: “Sally’s shoes” will be counted the same way as “Sally’s going to the store.” In the second case, change “Sally’s” to “Sally is.”
Helpful tip: Don’t lose a lot of sleep about spelling. The more files and words you have, the less it matters. If your study has 100 people who write 1,000 words each, even if 1 percent of your words are misspelled, it probably won’t matter. However, consistent misspellings should be corrected.
End of sentence markers and hyphens
The Words per sentence (WPS) category is based on the number of times that end-of-sentence markers are detected. These include all periods (.), question marks, and exclamation points. One potential problem is that common abbreviations (such as “Dr.”, “Ms.”, “U.S.A.”, “D.O.A.”) will be counted as multiple sentences unless the periods are removed. Be careful that the removal of the periods doesn’t make a new word. For example, the United States, or “U.S.”, becomes “US” (1st person plural pronoun) when the periods are removed. In this case, change it to “USA”.
Time markers (e.g., 6 a.m. or 7:30 p.m.) can also be a problem. Because “a.m.” without the periods is a verb, “am”, change time to 6am or 7:30pm.
When words start or end with hyphens, they are read by LIWC2015 as part of the word. LIWC2015, for example, lists “chit-chat” as a meaningful word in one of its dictionaries. In cases of hyphenated phrases such as “this-or-that” LIWC2015 will count it as three separate words since “this-or-that” is not in the dictionary.
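The period-related clean-up described above can be automated before text is submitted. The following sketch rewrites time markers such as “6 a.m.” to “6am” and strips periods from a small, hand-picked abbreviation list, mapping “U.S.” to “USA” so that it is not read as the pronoun “us”. The abbreviation table and the normalize_periods helper are illustrative assumptions and would need extending for real data.

```python
import re

# Illustrative abbreviation table; extend it for your own data.
ABBREVIATIONS = {
    "U.S.A.": "USA",
    "U.S.": "USA",   # not "US", which would be read as a pronoun
    "D.O.A.": "DOA",
    "Dr.": "Dr",
    "Ms.": "Ms",
}

# Matches times such as "6 a.m.", "7:30 p.m." or "11 A.M.".
TIME_MARKER = re.compile(r"(\d{1,2}(?::\d{2})?)\s*([ap])\.m\.", re.IGNORECASE)


def normalize_periods(text):
    """Remove periods that would otherwise be counted as sentence endings.

    Caveat: an abbreviation that also ends a sentence loses that final period.
    """
    text = TIME_MARKER.sub(lambda m: m.group(1) + m.group(2).lower() + "m", text)
    # Replace longer abbreviations first so "U.S.A." is not caught by "U.S.".
    for abbreviation in sorted(ABBREVIATIONS, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(abbreviation)
        text = re.sub(pattern, ABBREVIATIONS[abbreviation], text)
    return text


cleaned = normalize_periods(
    "Dr. Smith left the U.S. at 7:30 p.m. and arrived at 6 a.m. sharp"
)
```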
Common Internet Notations
Several types of Internet notations can lead to less accurate results. Examples of this include e-mail addresses, hashtags, and web URLs and other links. Like misspellings, the occasional Internet notation here or there will not likely impact your results in any significant way. However, you should consider altering them or removing them from your sample prior to analysis if your text data has a large number of these types of notation. A few examples are shown below.
Entry Type        Entry Example                 Recommended Replacement
E-mail Address    email@example.com             subEmailaddress
URL address       http://www.receptiviti.com    subURLaddress
Hashtag           #ReceptivitiBrilliant         subHashtag
Twitter Handle    @receptiviti                  subTwittername
Note that Receptiviti accommodates many types of “netspeak” used as shorthand in interpersonal communication (e.g., “lol”, “4ever”).
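The recommended replacements in this section can be applied with a few regular expressions, as sketched below. The patterns are deliberately simple approximations (they are our own, not part of the Receptiviti API) and will not catch every valid e-mail address or URL; note also that a URL followed directly by punctuation will absorb that punctuation.

```python
import re

# Simplified patterns; real-world addresses and links can be more varied.
URL = re.compile(r"(?:https?://|www\.)\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
HASHTAG = re.compile(r"(?<!\w)#\w+")
HANDLE = re.compile(r"(?<!\w)@\w+")


def replace_notations(text):
    """Swap Internet notations for the placeholder words recommended above."""
    text = URL.sub("subURLaddress", text)
    # E-mail addresses are replaced before handles so "@example" is not
    # mistaken for a Twitter handle.
    text = EMAIL.sub("subEmailaddress", text)
    text = HASHTAG.sub("subHashtag", text)
    text = HANDLE.sub("subTwittername", text)
    return text


cleaned = replace_notations(
    "Mail email@example.com or see http://www.receptiviti.com "
    "#ReceptivitiBrilliant via @receptiviti"
)
```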
9. Does word use validly reflect people’s psychological states?
Let’s rephrase that: If a person is using a high rate of anger words, are they really angry? This is a tough question to answer directly. It also points to the importance of hundreds of scientific studies that have been conducted since the early 1990s.
There have indeed been several studies that find that when people report themselves as being angry they use more anger-related words. Analyses of speeches, writings, and conversations show that people rate texts that are high in anger words as expressing higher rates of hostility. But is the speaker really angry? Is it possible that she or he is just pretending to be angry? This is a judgment call, and context matters. For example, if you’re analyzing the words of a Wikipedia page on “anger management”, the results likely have little to do with how angry the author was at the time of writing.
10. Does Receptiviti make mistakes in categorizing personality and language? Just how precise is it?
Receptiviti, like all text analysis tools, can make errors in identifying and counting individual words, especially words in isolation. Consider the word mad – a word that is counted in the Anger, Negative Emotion, and Overall Affect dictionaries. Usually, mad does reflect anger. Sometimes it expresses joy (he’s mad for her) and mental instability (mad as a hatter). Fortunately, this is seldom a problem because Receptiviti takes advantage of probabilistic models of language use. Yes, in a given sentence, the word mad might be used to express positive emotion. However, if the author is expressing a positive state of affairs, they will generally tend to use relatively high rates of other positive emotion words and few anger words. Small classification errors like this rarely impact the conclusions that can be drawn from the results because they are offset by the way that words are most commonly used by people.
Just as individual words may be misclassified, Receptiviti also does not understand irony, sarcasm, or metaphor. Again, it is all probabilistic. If someone is being mean spirited in their use of sarcasm, there is a good chance that Receptiviti will capture hostility in other word choices.
Please be careful in interpreting your Receptiviti output. The more words you analyze, the more trustworthy are the results. A text of 10,000 words yields far more reliable results than one of 100 words. For LIWC analysis, we recommend a minimum of 50 words. For Receptiviti analysis, we recommend a minimum of 300 words.
For both LIWC and Receptiviti analysis, if your language sample is smaller than the recommended minimum word count, be careful in interpreting results.
11. Sample Code:
Click here to see sample API code that’s been contributed by our community of developers to help you get started with Receptiviti faster.