SEO practitioners working on Baidu often rely on tactics and principles drawn from their experience with Google SEO, peppered with opinions and simple observation of search engine results. This Baidu Ranking Factors Correlation Study is my contribution to filling some of those gaps with factual SEO ranking factor data.
Marcus Pentzek, Director SEO, Jademond Digital
Research Insight: Mastering Baidu SEO – Strategies for Achieving Top Rankings on China’s Premier Search Platform
For an in-depth understanding of Baidu’s ranking intricacies, secure your copy of our comprehensive analysis. This essential guide meticulously navigates through the nuances of SEO for those operating within, or aiming to penetrate, the Chinese digital landscape.
The inaugural Baidu SEO Ranking Factors Study, a pioneering collaboration between Marcus Pentzek and Searchmetrics (now a Conductor subsidiary), was published in 2020. Following the acquisition of Searchmetrics and the subsequent removal of the study’s download options, Marcus Pentzek’s Jademond Digital is set to release an updated and enhanced edition by Christmas 2023. This forthcoming study expands its scope to Baidu’s Top 20 rankings, examines an increased array of potential ranking factors—150 compared to the previous 99—and delves deeper by assessing the distinct behaviors of these factors across shorthead, midtail, and longtail keywords where appropriate.
If you have any questions, or if you are a Baidu SEO expert who wants to contribute to this study, please contact us at firstname.lastname@example.org
Targets of the Baidu Ranking Factors Correlation Study:
Uncover the shared principles underpinning SEO strategies for Baidu and Google.
Access empirical insights with data-driven correlation analyses of ranking factors.
Gain actionable recommendations to enhance your prospects for securing a Page 1 presence on Baidu for your targeted keywords.
Gain a competitive edge by mastering Baidu SEO and outmaneuvering market rivals.
Correlations vs Ranking Factors
The Baidu Ranking Factors Study doesn’t turn every bit of data into a direct SEO action or confirm clear-cut factors that guarantee higher rankings. What it does is show you what the top-ranking websites on Baidu have in common. It gives you a snapshot of what Baidu’s search results look like right now. While it can help clear up some questions and offer tips on what you might want to try to rank higher, it’s not a checklist of surefire ranking boosts. Instead, it’s a deep dive into the trends and patterns that correlate with high rankings.
Differences from the 2020 Study with Searchmetrics
In 2020 we used the keywords from Searchmetrics’ Research Cloud for China. We deleted all keywords containing letters of the alphabet, as well as all keywords containing special characters, numbers, Cyrillic, Arabic, Japanese, … From the remaining all-Chinese keywords, we deleted those containing Traditional Chinese characters and those longer than 8 characters. From the resulting set, we randomly picked 50,000 keywords to make sure the keyword set did not cater only to high search volume keywords, did not cover only one industry, …
In 2023 we took a different approach. First, we no longer had the previous keyword set, as it is the property of Searchmetrics. Second, we saw a flaw in our previous way of picking keywords: although we probably had a fairly good mix of search volumes and topics, we probably also had quite a number of branded keywords of some kind, and branded keywords usually behave differently in search than regular keywords that “everyone” can (try to) rank for.
This time we looked at our own customer base across various industries. We reached out to China SEO colleagues at other companies and asked for the (anonymized) keyword sets they are trying to rank their clients for, covering many additional industries. We also looked at the top industry keywords in tools like Dragon Metrics and 5118. We manually filtered out keywords that are branded keywords for some companies, keywords prohibited in China (e.g. gambling), keywords catering to erotic topics, … and ended up with a fairly large keyword set covering many different industries and topics, B2B and B2C, informational as well as transactional search queries, at all levels of search volume and difficulty.
We stripped the final keyword set down to 10,000 keywords, making sure that no single industry (or small group of industries) would be more strongly represented than the others.
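One way to cut a keyword pool down to a target size without letting a large industry dominate is a round-robin draw across industries. The sketch below illustrates that idea only. The study does not state how the balancing was actually done, and the function name, industry labels, and pool sizes are hypothetical.

```python
import random
from collections import Counter


def balanced_sample(keywords_by_industry, target, seed=0):
    """Pick up to `target` keywords by cycling through the industries and
    taking one random keyword from each per round, so no industry gets ahead
    of the others until the smaller pools are exhausted."""
    rng = random.Random(seed)
    # Shuffle each industry's pool once; pop() then yields a random keyword.
    pools = {
        industry: rng.sample(keywords, len(keywords))
        for industry, keywords in keywords_by_industry.items()
    }
    picked = []
    while len(picked) < target and any(pools.values()):
        for industry in sorted(pools):
            if pools[industry] and len(picked) < target:
                picked.append((industry, pools[industry].pop()))
    return picked


# Hypothetical pools of very different sizes.
pools = {
    "automotive": [f"auto-{i}" for i in range(100)],
    "beauty": [f"beauty-{i}" for i in range(50)],
    "chemicals": [f"chem-{i}" for i in range(5)],
}
selection = balanced_sample(pools, target=30)
counts = Counter(industry for industry, _ in selection)
print(counts)
```

With this approach a small industry contributes all of its keywords, while the larger ones share the remaining slots evenly instead of in proportion to their pool size.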
In 2020 we looked at 50K keywords, but all calculations were done in my Excel sheets, so I had to reduce the number of data points used for the calculations and only looked at the top 10 ranking URLs.
In 2023 we moved all calculations to Python and only store the results in Excel, so we can cope with more data and looked at the top 20 ranking URLs.
In 2020 I used Excel and the CORREL() function to calculate the correlation score. As my computer regularly crashed when running this over all (up to) 500,000 data points in the Excel sheet, I decided to calculate the MEDIAN value for each position first and then run the correlation over only these 10 median values. That helped to understand tendencies but did not take into account the fluctuations within the whole data set.
In 2023 our Python scripts had no problems running the Spearman calculation over all (up to) 200,000 values (10K keywords x 20 positions). This is considered a more accurate correlation approach for looking at ranking factors, but it also leads to lower, less striking correlation scores, as the fluctuations in the large data set work against it.
So if you still have the 2020 study document at hand, you might see much clearer correlation values such as 0.8 there, whereas our new study might return a value of only 0.1 for a similar-looking graph.
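The effect described above can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not the study's code: the data is randomly generated (a weak trend across positions plus a lot of noise), all function names are hypothetical, and both methods are run over 20 positions so that only the calculation method differs. The 2020-style score applies a Pearson correlation (which is what Excel's CORREL() computes) to the per-position medians. The 2023-style score applies a Spearman correlation, implemented here as Pearson on average ranks, to every single data point.

```python
import random
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Synthetic data: 1,000 keywords x 20 positions. The factor value drops
# slightly with each position but is buried in noise.
rng = random.Random(42)
positions, values = [], []
for _ in range(1000):
    for pos in range(1, 21):
        positions.append(pos)
        values.append(100 - pos + rng.gauss(0, 30))

# 2020-style: one median per position, then Pearson over the 20 medians.
by_position = {pos: [] for pos in range(1, 21)}
for pos, val in zip(positions, values):
    by_position[pos].append(val)
median_positions = sorted(by_position)
medians = [median(by_position[pos]) for pos in median_positions]
r_medians = pearson(median_positions, medians)

# 2023-style: Spearman over all 20,000 individual data points.
rho_all = spearman(positions, values)

print(f"Pearson over medians:   {r_medians:.2f}")
print(f"Spearman over all data: {rho_all:.2f}")
```

Both scores are negative here because the value falls as the position number grows, meaning better-ranked pages have higher values. The median-based score comes out close to -1, while the score over all data points stays much closer to 0, even though both describe the same underlying trend.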
Number of Factors / Charts
In 2020 we looked at 83 different details that might have an impact on Baidu’s search algorithm.
In 2023 we are looking at more than 118 aspects (there are 118 charts, but some charts combine several minor aspects).
Depth of Analysis
In 2020 we analyzed the data just as one would expect in such a study: all results together.
In 2023 we quickly noticed a lot of negative correlations where we would have expected positive ones, so we decided to take a more detailed look at the differences between results for shorthead keywords (search volumes of 1000+), midtail keywords (search volumes between 50 and 999) and longtail keywords (search volumes below 50).
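The three segments can be expressed as a small helper that assigns each keyword to a bucket by its search volume, so that correlations can then be calculated per bucket. This is a minimal sketch: the function name and the sample keywords with their volumes are hypothetical, and only the thresholds come from the text above.

```python
from collections import defaultdict


def keyword_segment(search_volume):
    """Map a search volume to the segment names used in the study."""
    if search_volume >= 1000:
        return "shorthead"
    if search_volume >= 50:
        return "midtail"
    return "longtail"


# Hypothetical keywords with search volumes, for illustration only.
keyword_volumes = {
    "手机": 25000,
    "二手手机回收": 999,
    "工业机器人": 50,
    "工业机器人维修培训": 49,
}

segments = defaultdict(list)
for keyword, volume in keyword_volumes.items():
    segments[keyword_segment(volume)].append(keyword)

for name in ("shorthead", "midtail", "longtail"):
    print(name, segments[name])
```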
When it came to keyword usage in content, we even looked at keyword length, as it is easier to work a keyword of just 2 characters into the content as an exact match than a keyword of 8 or more characters.
We plan to release the study for Christmas this year, as a gift to all SEOs interested in SEO for China. Follow me on LinkedIn or Facebook to be among the first to know when it is available.