When it comes to customer relationship management software, Salesforce is the undisputed king, so it is no surprise that there are many folks looking to integrate Salesforce data with other sources for a deeper look into performance.

Let’s take a look at the types of queries we can run on Salesforce data and how to integrate Salesforce into your database.

Using SQL To Analyze Salesforce Data

One of the most important sets of metrics for a sales organization is the sales pipeline funnel. Knowing the conversion rates at each stage of the sales pipeline helps your forecast revenue numbers and figure out where you need to improve.

We have a field new_value in our _lead_history table that lets us count the leads by stage, and we can use the user’s team to track across market segments. By dividing by the total leads, we can look at the conversion rates over the last two weeks:

with total as (select count(1) from salesforce._lead_history)
    when new_value ilike '%assigned%' then 'Assigned' 
    when new_value ilike '%attempt%' then 'Contacted' 
    when new_value ilike '%qualified%' then 'Qualified' 
    when new_value ilike '%won%' then 'Won' end
  , _user.team
  , count(1) / total
    join salesforce._user on 
    _lead_history.created_by_id = _user.id
  field = 'Status'
  and created_date >= 
    date_trunc('day', getdate() - interval '14 day')::date
group by
  1, 2

Running this query gives us back triplets with the stage, team name, and count, like (Contacted, Mid-Market, 0.363). We can turn this into a visualization of our pipeline across market segments:

Market segments funnel

Integrating Other Data Sources

An advantage to downloading your data is being able to join with data sources outside of Salesforce. Users provide a billing address and are assigned a latitude and longitude, but some users cannot be geolocated.

We can use their zip code in this case to join with our standard zip code map to include these users in geographical analysis.

  latitude, longitude
  salesforce._account join zip_codes on 
    left(billing_postal_code, 5) = zip_codes.zip_code
  billing_latitude is null
  and billing_postal_code is not null
union select
    billing_latitude, billing_longitude
    billing_latitude is null

This gives us the full picture of our U.S. users locations:

Account geolocation

Another important growth metric for a company is the ability to grow existing relationships. We can cross-reference our internal usage statistics to find customers ready to move to the next upgrade tier.

  opportunities.company as Client
  , date_trunc('week', (created_date)) as Start_Date
  , (client.requests - client.paid_requests) as Overage
  , user.name as Rep
  join users on
    opportunities.owner_id = users.id
  join client on 
    client.site = opportunities.company

To export data from Salesforce, we will use a connected app. From the setup menu in Salesforce, we’ll first navigate to Build -> Apps:

We can use the result to grow our relationship with existing customers:

Farming dashboard

Extracting Data: Setting Up A Connected App In Salesforce


And create a new connected app:

No apps found

We input a name, email address, and check the “Enable OAuth Settings” box. We need API access to the data and a valid callback URL — we won’t actually be calling it, but it should be a properly formatted URL.

This will give us a Consumer Secret and Consumer Key which we’ll use in conjunction without login information to access the data.


You’ll use the Consumer Secret and Consumer Key with the Rest API to extract the schema, and your Username and Password to access the Bulk API. There are many Salesforce API integrations that will help you pull the data.

One of our preferred options is Heroku’s Salesforce REST API ruby gem to use Javier Julio’s Salesforce Bulk API gem to access the Salesforce APIs.

Configuring Salesforce Data For Your Database

Before loading the data into your database, you’ll want to convert the column types to match your database and reformat the schema to mesh well with your table and column names. For postgres and redshift databases, we use this conversion table from Salesforce type to postgres type.

Salesforce     Postgres
id             varchar(255)
reference      varchar(255)
phone          varchar(255)
boolean        boolean
double         double precision
currency       double precision
datetime       timestamp
date           date
int            integer

And defer to varchar for any other types.

The Salesforce schema naming conventions might need some normalization to match your internal conventions.

At a minimum, we recommend replacing double underscores with a single underscore, removing non-alphanumeric characters, and stripping __c from the end of tables, which Salesforce uses to mark custom tables.

Now you can upload the data into your database and start running queries! We hope this post is helpful while integrating Salesforce into your data analytics stack, let us know what steps you took to connect with Salesforce.

Thanks to Megan Lin and Melody Chan for their help brainstorming this blog post.