If you plan on handing off help center articles for translation on a regular basis, you can automate the process by using the Help Center API to download updated articles to send to your localization vendor. You can also use the API to upload the translations returned by the localization vendor.


Disclaimer: Zendesk provides this article for instructional purposes only. Zendesk does not support or guarantee the code. Zendesk also can't provide support for third-party technologies such as Python or its libraries. Please post any issues in the comments section below or search for a solution online.

Understanding article updates

This article describes how to download articles that were updated after a start time or between a date range to send to translation. Both methods rely on the article's updated_at property. The basic logic is to check when each article was last updated to identify articles that need to be translated or retranslated.

However, when an article was last updated and when it was last edited might be different. The updated_at property not only records when the article was last edited, it also records when other article updates were made, such as when somebody added a comment to the article. As a result, some articles might not need to be translated even if they were "updated".

For the purpose of sending articles to translation, an article's edited_at time is a better measure. The edited_at time reflects when the article's title or body was last updated and published. If the article hasn't been edited yet, it reflects when the article was created.

As a result, this article also describes how to filter downloaded articles by their edited_at property.
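The distinction can be expressed in a few lines. The sketch below uses made-up article records and a hypothetical last-handoff time; it keeps only articles whose edited_at falls after that time:

```python
from datetime import datetime, timezone

# Hypothetical article records shaped like Help Center API responses
articles = [
    {'id': 1, 'updated_at': '2021-04-20T10:00:00Z', 'edited_at': '2021-04-20T10:00:00Z'},
    {'id': 2, 'updated_at': '2021-04-21T09:00:00Z', 'edited_at': '2021-03-01T08:00:00Z'},
]

last_handoff = datetime(2021, 4, 1, tzinfo=timezone.utc)

def parse(ts):
    # Zendesk timestamps end in 'Z'; swap it for an offset fromisoformat accepts
    return datetime.fromisoformat(ts.replace('Z', '+00:00'))

# Article 2 was "updated" (for example, a comment was added) but not edited
# since the last handoff, so it is filtered out
to_translate = [a for a in articles if parse(a['edited_at']) > last_handoff]
print([a['id'] for a in to_translate])  # [1]
```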

Option 1 - Downloading a list of articles

The first option for downloading articles is to compile a list of articles and then download each one with the API. For example, when contributors update an article, you can have them add the article to a list in a shared worksheet or doc. You can also build a small app that saves the information in a lightweight database such as SQLite. This is the approach the Zendesk Docs team uses to download articles for handoffs.

Use the Show Article endpoint to download each article. The endpoint downloads a specific article by id and looks like this:

https://{subdomain}.zendesk.com/api/v2/help_center/articles/{article_id}.json

Write a script or a function that loops, or iterates, through your list of articles and makes a request to the Show Article endpoint for each article.

Example

The following Python script iterates over a list of article ids and downloads each article. The example assumes the Zendesk account's subdomain is named "example" (example.zendesk.com).

import json
import requests

handoff_articles = []
article_list = [2342654, 2432643]
for article_id in article_list:
    url = f'https://example.zendesk.com/api/v2/help_center/articles/{article_id}.json'
    response = requests.get(url).json()
    article_content = {
        'title': response['article']['title'],
        'body': response['article']['body'],
        'id': article_id
    }
    handoff_articles.append(article_content)

with open('handoff_articles.json', mode='w', encoding='utf-8') as f:
    json.dump(handoff_articles, f, sort_keys=True, indent=2)

How it works

The script starts by importing a few libraries. The json library is a native Python library but the requests library is installed separately. The Requests library makes working with HTTP requests easier. See the installation instructions.

The script sets up a for loop to iterate over the article ids. It makes an API request to the Show Article endpoint with each id:

article_list = [2342654, 2432643]
for article_id in article_list:
    url = f'https://example.zendesk.com/api/v2/help_center/articles/{article_id}.json'
    response = requests.get(url).json()

The API returns JSON data. The json() method converts the returned data into a Python data type to make it easier to work with. In this case, the data is converted into a dictionary, the Python equivalent of a JavaScript object literal.
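The conversion is the same as parsing the response body yourself with the standard json library. A minimal illustration with a made-up, stripped-down response body:

```python
import json

# A stripped-down Show Article response body (hypothetical values)
raw_body = '{"article": {"id": 2342654, "title": "Soleil", "body": "<p>Un, deux, trois</p>"}}'
response = json.loads(raw_body)  # same result as calling response.json() with requests

# The top-level value is now a plain Python dictionary
print(type(response).__name__)       # dict
print(response['article']['title'])  # Soleil
```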

Each article record returned by the API contains all kinds of data about the article. For translation purposes, you only need the article's title and body. The script packages the title and body of each article and appends the package to the list of handoff articles:

article_content = {
    'title': response['article']['title'],
    'body': response['article']['body'],
    'id': article_id
}
handoff_articles.append(article_content)

The script also includes the article id so that you can upload the translations to the article later. See Uploading a deliverable.

At the end of the loop, the handoff_articles variable contains the articles to send for translation. The script saves the downloaded articles to a JSON file that you can hand off to your localization vendor.

Option 2 - Downloading articles edited after a start time

The second option for downloading articles for translation is to get all the articles that were edited after a specific start time. This differs from the date-range option in that you don't specify an end time. You export all the articles that were edited from the start time to the present time. This approach is also called an incremental export.

You can use the incremental export version of the List Articles endpoint to find the articles that were edited after a certain date. The endpoint looks like this:

https://{subdomain}.zendesk.com/api/v2/help_center/incremental/articles?start_time={start_time}

where {start_time} is an epoch time like 1618348927. The endpoint returns any article whose metadata changed after the start time, including changes to properties other than updated_at or edited_at.

Use the endpoint to incrementally export all articles with metadata changes. For each article, compare the edited_at time with the end time of the previous export. If the article was edited after the end time, add it to your handoff package.

To make sure you don't skip articles or get the same articles again over successive exports, use the end time of the previous export as the start time for the next export. For example, if the end time of your last export was the epoch time of 1617224682 (2021-03-31T21:04:42 UTC), then use that time as the start time for your next export.

Example

The following Python script downloads the articles that were edited after the previous export's end time of 1618348927 (2021-04-13T21:22:07 UTC). The example assumes the Zendesk account's subdomain is named "example" (example.zendesk.com).

import json
import requests
import arrow

handoff_articles = []
auth = ('your_email', 'your_password')

end_time = 1618348927           # 2021-04-13T21:22:07 UTC
start_time = end_time

print(f'- getting the articles edited since {end_time}')
edited_articles = []
url = 'https://example.zendesk.com/api/v2/help_center/incremental/articles.json'
while url:
    params = {'start_time': start_time}
    response = requests.get(url, params=params, auth=auth).json()

    for article in response['articles']:
        if arrow.get(article['edited_at']).timestamp() < end_time:
            continue
        edited_articles.append(article)

    if response['next_page']:
        start_time = response['end_time']
    else:
        end_time = response['end_time']
        url = None

print('- packaging and saving the content to handoff_articles.json')
for article in edited_articles:
    article_content = {
        'title': article['title'],
        'body': article['body'],
        'id': article['id']
    }
    handoff_articles.append(article_content)

with open('handoff_articles.json', mode='w', encoding='utf-8') as f:
    json.dump(handoff_articles, f, sort_keys=True, indent=2)

print(f'- use the following end time value in the next export: {end_time}')

How it works

The script starts by importing a few libraries. The Requests library makes working with API requests simpler and the Arrow library makes working with dates simpler. The json library is a native Python library, but you need to install the requests and arrow libraries separately.

Next, the script declares a variable to store the exported articles.

handoff_articles = []

The next line specifies your Zendesk credentials. You must be an agent or admin to use the incremental article export endpoint.

auth = ('your_email', 'your_password')

To make sure articles aren't skipped or exported twice over successive exports, the script sets the start time of the export using the end time of the previous export:

end_time = 1618348927           # 2021-04-13T21:22:07 UTC
start_time = end_time

If this is your first article export, choose a date and time, plug it into an epoch time converter (http://www.epochconverter.com/), and use the result as the end_time.
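If you'd rather not leave the script, you can compute the epoch value with the standard library instead. A sketch using the date from this example:

```python
from datetime import datetime, timezone

# Epoch seconds for 2021-04-13T21:22:07 UTC, the end time used in this example
end_time = int(datetime(2021, 4, 13, 21, 22, 7, tzinfo=timezone.utc).timestamp())
print(end_time)  # 1618348927
```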

Next, the script gets all the articles that were edited after the previous export's end time.

print(f'- getting the articles edited since {end_time}')
edited_articles = []
url = 'https://example.zendesk.com/api/v2/help_center/incremental/articles.json'
while url:
    params = {'start_time': start_time}
    response = requests.get(url, params=params, auth=auth).json()

    for article in response['articles']:
        if arrow.get(article['edited_at']).timestamp() < end_time:
            continue
        edited_articles.append(article)

Note: The response will list only the articles that the requesting agent can view in the help center.

The script skips (continues) to the next article in the response if an article's edited_at time is earlier than the end time of the previous export. The arrow library's timestamp() method converts the edited_at time to epoch time for the comparison. If the article passes the test, it was edited after the previous export, so the script appends it to the edited_articles list.
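If you'd rather avoid the arrow dependency, the same comparison works with only the standard library. A sketch using a hypothetical edited_at value:

```python
from datetime import datetime

def to_epoch(iso_string):
    # Zendesk timestamps end in 'Z'; swap it for an offset fromisoformat accepts
    return int(datetime.fromisoformat(iso_string.replace('Z', '+00:00')).timestamp())

end_time = 1618348927  # 2021-04-13T21:22:07 UTC, the previous export's end time
edited_at = '2021-04-20T08:30:00Z'  # hypothetical article edited_at value
print(to_epoch(edited_at) >= end_time)  # True -> keep the article
```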

You could also further define the scope of the export by specifying help center sections that aren't translated and skipping articles in those sections. Example:

for article in response['articles']:
    if article['section_id'] in [3532453456, 4345322342, 5633456]:
        continue
    ...

Next, the script paginates through the results:

while url:
    ...

    if response['next_page']:
        start_time = response['end_time']
    else:
        end_time = response['end_time']
        url = None

The pagination method of the incremental List Articles endpoint differs in two important ways from the offset pagination method of the Help Center API. First, the endpoint returns up to 1,000 articles per page. Second, the next_page url doesn't specify a next page number (page=2). Instead, it specifies a new start time, which is the time of the last change of the last item on the current page. The response also records this time as the end_time of the current page.

For example, if the request's initial start_time is 1617731790, the response may contain the following next_page and end_time values:

"next_page": "https://support.zendesk.com/hc/api/v2/incremental/articles.json?start_time=1617733880","end_time": 1617733880,...

Note that the start time of next_page is different from the initial start time and is equal to the page's end_time.

After reaching the last page of results (when next_page is None), the script immediately records the end time to use for the next export:

end_time = response['end_time']

Next, the script packages the translatable content of each article and adds the article to the list of handoff articles:

for article in edited_articles:
    article_content = {
        'title': article['title'],
        'body': article['body'],
        'id': article['id']
    }
    handoff_articles.append(article_content)

At this point, the handoff_articles variable contains all the articles to send for translation. The script saves the articles to a JSON file named handoff_articles.json that you can hand off to your localization vendor.

The script finishes up by telling the user what value to use for the end_time variable in the next export:

print(f'- use the following end time value in the next export: {end_time}')

You could also save the end_time value to a file and add some code at the top of the script to read the file in subsequent exports.
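For example, a small pair of helper functions could handle the bookkeeping. This is a sketch; the state file name is made up:

```python
import os

STATE_FILE = 'last_end_time.txt'  # hypothetical name for the state file

def load_end_time(default):
    # Read the end time saved by the previous export, falling back to a default
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return int(f.read().strip())
    return default

def save_end_time(end_time):
    # Record this export's end time for the next run
    with open(STATE_FILE, 'w') as f:
        f.write(str(end_time))

save_end_time(1618348927)
print(load_end_time(default=0))  # 1618348927
```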

Option 3 - Downloading articles edited between a date range

The third option for downloading articles is to get all the articles that were edited in a specified date range. You can use the Search Articles endpoint to search for these articles. The endpoint looks like this:

https://{subdomain}.zendesk.com/api/v2/help_center/articles/search.json?{search_params}

For example, to search for articles in a specific category that were updated between 2021-03-15 and 2021-03-31, you'd make the following API request:

https://example.zendesk.com/api/v2/help_center/articles/search.json?category=21345&updated_after=2021-03-14&updated_before=2021-04-01

In the example, the updated_after date is actually one day before the start date of 3/15 and the updated_before date is one day after the end date of 3/31. This captures all the articles that were updated in the 17-day period between 3/15 and 3/31 inclusive.
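You can sanity-check the boundary arithmetic with a quick calculation:

```python
from datetime import date

# The handoff range is inclusive on both ends
start = date(2021, 3, 15)
end = date(2021, 3, 31)
print((end - start).days + 1)  # 17 days in the range

# The search parameters sit one day outside each boundary
updated_after = date(2021, 3, 14)
updated_before = date(2021, 4, 1)
print(start > updated_after and end < updated_before)  # True
```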

When the next handoff rolls around, make sure you don't miss articles or download the same articles again by using the end date of the previous date range as the updated_after value of the next download. For example, to search for articles updated between 2021-04-01 and 2021-04-15, you'd use the 3/31 end date of the previous date range as the updated_after date in the next download:

https://example.zendesk.com/api/v2/help_center/articles/search.json?category=21345&updated_after=2021-03-31&updated_before=2021-04-16

Example

The following Python script downloads articles that were edited between 2021-03-15 and 2021-03-31. The example assumes the Zendesk account's subdomain is named "example" (example.zendesk.com).

import json
import requests
import arrow

handoff_articles = []
category_ids = ['36000171893', '36000171913', '36000175874']

updated_after = '2021-03-14'
updated_before = '2021-04-01'

print('- getting the articles edited during the date range')
edited_articles = []
base_url = 'https://example.zendesk.com/api/v2/help_center/articles/search.json'
for category_id in category_ids:
    print(f'  - searching {category_id}')
    params = {
        'category': category_id,
        'updated_after': updated_after,
        'updated_before': updated_before
    }
    url = base_url   # reset pagination for each category
    while url:
        response = requests.get(url, params=params).json()
        for article in response['results']:
            if arrow.get(article['edited_at']) < arrow.get(updated_after).shift(days=1):
                continue
            edited_articles.append(article)
        url = response['next_page']

print('- packaging and saving the content to handoff_articles.json')
for article in edited_articles:
    article_content = {
        'title': article['title'],
        'body': article['body'],
        'id': article['id']
    }
    handoff_articles.append(article_content)

with open('handoff_articles.json', mode='w', encoding='utf-8') as f:
    json.dump(handoff_articles, f, sort_keys=True, indent=2)

How it works

Like the second option, the script starts by importing a few libraries. The Requests library makes working with API requests simpler and the Arrow library makes working with dates simpler. The json library is a native Python library, but you need to install the requests and arrow libraries separately.

Next, the script declares a variable to store the exported articles and another variable that specifies the categories that contain articles to be translated.

handoff_articles = []
category_ids = ['36000171893', '36000171913', '36000175874']

You can make the scope of the search as wide or as narrow as you want. If you translate all the content in your help center, you could use the List Categories endpoint to get the ids of all the categories. If you only translate the content in a few sections, you could specify a list of section ids and search those sections only.

Specifying categories or sections is required. The Search Articles endpoint requires you to specify at least one of the following parameters for each search:

  • query
  • category
  • section
  • label_names

Because we're searching by date range, not by search term (query) or by label, we have to specify either category or section.

Next, the script sets the date range:

updated_after = '2021-03-14'
updated_before = '2021-04-01'

Next, the script performs the search in each category. Because cursor-based pagination is not supported with the Search Articles endpoint, the script uses offset pagination.

edited_articles = []
base_url = 'https://example.zendesk.com/api/v2/help_center/articles/search.json'
for category_id in category_ids:
    print(f'  - searching {category_id}')
    params = {
        'category': category_id,
        'updated_after': updated_after,
        'updated_before': updated_before
    }
    url = base_url   # reset pagination for each category
    while url:
        response = requests.get(url, params=params).json()
        for article in response['results']:
            if arrow.get(article['edited_at']) < arrow.get(updated_after).shift(days=1):
                continue
            edited_articles.append(article)
        url = response['next_page']

The script skips (continues) to the next article in the response if an article's edited_at date is earlier than the search's updated_after date, that is, if the article was last edited sometime before the search date range.

The test is necessary because an article's updated_at property not only records when the article was last edited, it also records other article updates such as when somebody added a comment to the article. As a result, an article might not need to be translated.

A full day is added to the updated_after value (arrow.get(updated_after).shift(days=1)) because arrow interprets the date-only updated_after string as the very beginning of that day (2021-03-14T00:00:00+00:00). Without the added day, the script would keep articles edited during 3/14, which is outside the search date range.
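The same cutoff logic can be expressed with the standard library. A sketch with a hypothetical edit time:

```python
from datetime import datetime, timedelta, timezone

updated_after = '2021-03-14'
# The date-only string resolves to the very start of the day; shifting one day
# forward gives the true lower bound of the range, 2021-03-15T00:00:00 UTC
cutoff = datetime.fromisoformat(updated_after).replace(tzinfo=timezone.utc) + timedelta(days=1)

edited_at = datetime(2021, 3, 14, 18, 0, tzinfo=timezone.utc)  # hypothetical edit time
print(edited_at < cutoff)  # True -> the article is skipped
```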

Next, the script packages the translatable content of each article and adds the article to the list of handoff articles:

for article in edited_articles:
    article_content = {
        'title': article['title'],
        'body': article['body'],
        'id': article['id']
    }
    handoff_articles.append(article_content)

At this point, the handoff_articles variable contains the articles to send for translation. The script saves the downloaded articles to a JSON file that you can hand off to your localization vendor.

Uploading a deliverable

After a while, your localization vendor will send the translations back to you. Before you start, sit down with the vendor and discuss the format and structure of the deliverable. For example, you could agree with the vendor that the deliverable should be a JSON file with the following structure:

{  "fr": [    {"body": "Un, deux, trois", "title": "Soleil", "id": 21},    {"body": "Quatre, cinq, six", "title": "Lune", "id": 24},    {"body": "Sept, huit, neuf", "title": "Etoile", "id": 39}  ],  "es": [    {"body": "Uno, dos, tres", "title": "Sol", "id": 21},    {"body": "Cuatro, cinco, seis", "title": "Luna", "id": 24},    {"body": "Siete, ocho, nueve", "title": "Estrella", "id": 39}  ]}

Each locale lists the articles translated into that language. With this information, you can write a script to read the translations and then use the Help Center API to upload them to your help center.

The following example uploads the translations from a JSON file deliverable structured like the one above. The example assumes the Zendesk account's subdomain is named "example" (example.zendesk.com). See the print statements and code comments in the example to learn how it works:

import json
import requests

auth = ('your_email', 'your_password')
print('- reading deliverable file')
with open('2021-04-15_deliverable.json', mode='r', encoding='utf-8') as f:
    deliverable = json.load(f)

for locale in deliverable:
    print(f'\n- uploading {locale} translations')
    for article in deliverable[locale]:
        article_id = article['id']

        # get missing translations to determine whether to use a POST or PUT request
        url = f'https://example.zendesk.com/api/v2/help_center/articles/{article_id}/translations/missing.json'
        response = requests.get(url, auth=auth).json()
        missing_translations = response['locales']

        if locale in missing_translations:
            # new translation -> do a POST request
            print(f'  - posting translation for article {article_id}')
            post_url = f'https://example.zendesk.com/api/v2/help_center/articles/{article_id}/translations.json'
            data = {'translation': {'locale': locale, 'title': article['title'], 'body': article['body']}}
            response = requests.post(post_url, json=data, auth=auth)
        else:
            # existing translation -> do a PUT request
            print(f'  - putting translation for article {article_id}')
            put_url = f'https://example.zendesk.com/api/v2/help_center/articles/{article_id}/translations/{locale}.json'
            data = {'translation': {'title': article['title'], 'body': article['body']}}
            response = requests.put(put_url, json=data, auth=auth)

Next steps

This article is for instructional purposes only. Its code is not meant to be used in a production environment. However, you can copy the code and modify or extend it to fit your requirements as well as to learn more as you go. For example, you could bundle the code into a function to use in a larger application. Example:

def export_edited_articles(last_exported, category_ids):
    handoff_articles = []
    start_time = last_exported
    ...