Suyati’s session at Dreamforce 2016- Major extracts
Suyati feels extremely proud to be presenting this year at the world’s largest Cloud Computing conference Dreamforce. Our Salesforce evangelists Arunraj Pazherickal and Abhishek Sivasubramanian were selected to speak at Dreamforce 2016 . Their topic was “SOQL Performance Tuning and the Multi-Tenant Query Optimizer”
The developer community often faces issues around performance and most often it is not due to voluminous aspects in apex code, instead it is because the developers don’t give much attention to crafting the SOQL (Salesforce Object Query Language) statements. May be developers tend to take this in a much lighter way since it’s easy to write a SOQL statement.
The session showed the audience the types of issues that a SOQL statement can introduce resulting in performance issues in the Force.com platform.
Following are some of the most common performance issues faced by the developer community:
- Salesforce Reporting: You should have come across situations where your Salesforce users say that some reports are taking long time to render the data. Even after adjusting the report filters, it still doesn’t make any considerable improvement.
- Apex Logic: As said earlier, developers might overlook the simplicity of SOQL and write some SOQL that does return data in sandboxes with lower data count. But the same SOQL sporadically time outs when you deploy the code base to production (mainly because the tons of data are not queried properly by the Force.com platform. This happens because the SOQL queries were not checked for performance attributes).
- Visualforce: We all know that Visualforce pages are user facing pages and they definitely need to be performant. The Visualforce controllers (apex logic) which query and feed data into the page might be using SOQL and these queries can also affect execution time if they are not checked for performance.
- List Views: Almost everyone uses list views on the Tabs in Salesforce. These list views help us to see a subset of data that meets some kind of criteria. Sometimes users create list views with filter criteria that pull loads of data. These filters act like SOQL statement in apex and if not checked for performance issues, it can make your life hell.
Considering all the above situations, we definitely can understand that SOQL is a very crucial part in every aspect of Salesforce Cloud. Next question that arises is what engine in the Force.com Platform is executing these SOQL statements and how can we make that engine perform well in returning the results.
Force.com Query Optimizer is the primary command center in Force.com platform that is responsible for executing the SOQL/Filter statements and returning the data back to the code/Front-end. Force.com Query Optimizer is like a black box to whoever is reading this sentence. So, let’s familiarize the building blocks inside Force.com Query Optimizer.
Force.com Query Optimizer is composed of 3 simple building blocks.
1. Pre Query Block: The Pre-Query block is responsible for doing some background checks on what all things the Query Optimizer needs to know prior to running the SOQL statement. It checks for 2 major aspects.
a.User Visibility: If user A is initiating a SOQL query, then with respect to Organization Wide Defaults, Profiles, Permission Sets and Sharing Rules, out of N number of records in the system, how much can user A actually see/access. Is it all of the “N” records or a subset of “N”.
b.Filter Selectivity: Here, SOQL query is checked for the default filter conditions that are present in the WHERE clause part of the SOQL query and judges if any additional refinement is necessary or not. It checks if the WHERE clause part uses any standard index/custom index fields.
2. Selective/Optimal Filter Block: Selective Filter is a field + condition that makes a query deterministic. This block takes the results obtained from the Pre Query block and then tries to refine the WHERE clause part of the SOQL query. Assume that user A initiates a SOQL query, the Pre Query block inspects the WHERE clause part of this SOQL query and tries to refine/tune it up so that the Force.com Query Optimizer will be able to ensure that selectivity is always attained. This means that if your org has 1 million records, Force.com Query Optimizer will be very unhappy if you provide a SOQL query that returns the full 1 million records. If you provide a SOQL query that selects 10 particular records of your interest from the whole 1 million records in the system, then the Force.com Query Optimizer will be very happy because selectivity is attained in this case. The Selective/Optimal Filter block has its own components like:
a. Field Filter: Field Filter is a field that determines a subset of data that can be qualified. This means that the field that is used in the WHERE clause of the SOQL query should be able to select records in an efficient manner.
Selective Examples: Date fields are selective in nature because Date is often mentioned as a particular date value or between ranges of date. Similarly, Picklist fields are also selective in nature because the picklist would have almost 10 or 15 options around which the data lies. This means that we know the range of the data if a picklist field is used. This is why we call this component as a Field Filter (a field that acts like a filter).
Non Selective Examples: Text type fields/Numeric/Currency fields are non-selective in nature because text/numeric/currency fields can vary infinitely and it’s very difficult to rely on these kind of fields whose range cannot be predicted.
b. Operator: Operator is a logical condition that can quickly select data.
Selective Examples: “=” equal to operator is by default selective because you can specify a particular value (like a=2). STARTS WITH is also selective in nature because you can specify it using LIKE and Trailing % wildcard (City LIKE ‘New%’).
Non Selective Examples: CONTAINS is non-selective in nature because optimizer has to check the occurrence of a value in main text (at start, at end, anywhere in middle). This becomes an overhead for the Query Optimizer. Use of ! or NOT operator is also non-selective in nature because negative conditions introduce negativity/uncertainty.
c. Index: Index is a field that uniquely identifies a record/tuple. There are 2 types of indexes namely:
- Standard Index: Standard Index is a standard field provided by the platform that uniquely identifies a record/tuple. Primary Keys (Id), Foreign Keys (Lookup and Master-Detail relationship fields) and Audit fields (CreatedDate, SystemModstamp) etc are considered as Standard Index fields.
- Custom Index: Custom Index is a custom field set by us that uniquely identifies a record/tuple. It basically depends on your business domain. Some common examples are Phone, Employee Identification Number, Passport Number, etc. We are the ones who should find custom fields that are eligible to become Custom Index in Salesforce Org. For this, we can use “Query Plan” tool in Salesforce. We will be talking about Query Plan in detail in the upcoming sections.
d. Thresholds: Threshold is the maximum number of records that the Query Optimizer can return for further filtering. Threshold considerations are done for:
1. Thresholds for Standard Indexes
Standard Indexes have greater thresholds of 30% of the 1 to 1 million records + 15% of the next 1 million+ records which together form the Final Threshold. This means that if you use standard indexes in your SOQL query, you will be getting an advantage of 30%-15% records from the platform on which the SOQL filtering would be done. As a result, you get the best of the best records that satisfy your filter condition.
2. Thresholds for Custom Indexes
Custom Indexes have comparatively lesser threshold of 10% of the 1 to 1 million records + 5% of the next 1 million+ records which together form the Final Threshold. This means that you need not force yourself to use standard indexes every time in your SOQL, instead you can depend on a custom field which can uniquely identify records in your custom object (which is called as custom index) with threshold benefits to get the best of the best records that satisfy your filter condition. You might be wondering how to make out which custom field is eligible to become a custom index. For this, you need to use Query Plan tool (which we will discuss later in this blog) and keep monitoring the inflow of record data and judiciously decide with facts that a custom field in my custom/standard object (like Passport Number custom field on Contact object) is always unique in the org.
Now, let’s discuss about Query Plan (which we said is a way to figure out if we are keeping the Force.com Query Optimizer happy).
Query Plan tool can be accessed from Developer Console in Salesforce. Open the Developer Console, navigate to Help à Preferences. In the pop up window, tick “Enable Query Plan” and hit Save.
At the bottom pane, you will be able to see a new button named “Query Plan”. Great !!
Now, type a simple SOQL query in the “Query Editor” and hit the “Query Plan” button. A window will pop up as shown below with lots of statistics taken directly from the Force.com Query Optimizer engine.
What is a Query Plan?
Query Plan refers to a set of disciplined and ordered steps used to access data in the DBMS.Query Plan is an ordered set of steps used to access data in the DBMS. In the figure below, if we want to navigate from point A to point B, mathematically there are “n” number of routes to achieve this task. You can choose Route 1 or Route 2 or Route 3 to reach point B. But we can easily make out that the shortest/efficient path to reach point B is the Route 2.
In Force.com Query Optimizer terminology, we call each Route as Query Plans. This means that Force.com Query Optimizer can query and return results using many Query Plans but one among them will be the most efficient Query Plan.
When we tried to see the Query Plan for a sample SOQL, we did see a window popping up with many columns and some row data.
Let’s understand what each of these columns mean.
- Leading Operation Type – Primary Operation type used by Salesforce to optimize the query. This means that if a SOQL query has Id=1234 in the WHERE clause, the SOQL query results would be driven by the Id field and its value. So, we can say that “Id” which is a standard index is the Leading Operation Type for the Query Plan.
- Cardinality – Refers to the estimated number of records that will be returned by the leading operation type. This means that if a SOQL query has Id=1234 in the WHERE clause, then Cardinality would return 1 because the estimated number of records for a SOQL query with Id equal to an absolute value would always return 1.
- sObject Cardinality – approximate number of records for the query object. sObject Cardinality gives us the estimated number of records returned by the SOQL query if the WHERE clause part was omitted. So, the value would represent how many records we have on the sObject table.
- Cost – The cost of the query compared to Force.com query optimizers’ selectivity threshold. (<=1 means selective, >1 means non-selective)
So, the aim for us is to use the Query Plan and make sure that refinements are done to the SOQL to keep the cost less than 1. To refine the SOQL and reduce the cost less than 1, we need to follow the recommendations mentioned above like to include Field Filters, avoiding negative operators, use the advantages and standard and custom index thresholds, etc. This will ensure that the Force.com Query Optimizer is kept happy. If kept happy, we get the results faster and accurate.
This session video has a demo on how we can convert bad SOQL to good SOQL. Stay tuned to see the complete session video which will soon be updated in this blog.
To know more about Suyati’s Salesforce expertise, please send an email to email@example.com.