Index your Dropbox content using the Dropbox connector for Amazon Kendra


Amazon Kendra is a highly accurate and simple-to-use intelligent search service powered by machine learning (ML). Amazon Kendra offers a suite of data source connectors to simplify the process of ingesting and indexing your content, wherever it resides.

Valuable data in organizations is stored in both structured and unstructured repositories. An enterprise search solution should be able to pull together data across several structured and unstructured repositories to index and search on.

One such data repository is Dropbox. Enterprise users use Dropbox to upload, transfer, and store documents in the cloud. Along with the ability to store documents, Dropbox offers Dropbox Paper, a coediting tool that lets users collaborate and create content in one place. Dropbox Paper can optionally use templates to add structure to documents. In addition to files and Paper documents, Dropbox also allows you to store shortcuts to webpages in your folders.

We’re excited to announce that you can now use the Amazon Kendra connector for Dropbox to search information stored in your Dropbox account. In this post, we show how to index information stored in Dropbox and use the Amazon Kendra intelligent search function. In addition, Amazon Kendra’s ML-powered intelligent search can accurately find information in unstructured documents that contain natural language narrative content, for which keyword search is not very effective.

Solution overview

With Amazon Kendra, you can configure multiple data sources to provide a central place to search across your document repository. For our solution, we demonstrate how to index a Dropbox repository or folder using the Amazon Kendra connector for Dropbox. The solution consists of the following steps:

  1. Configure an app on Dropbox and get the connection details.
  2. Store the details in AWS Secrets Manager.
  3. Create a Dropbox data source via the Amazon Kendra console.
  4. Index the data in the Dropbox repository.
  5. Run a sample query to get the information.

Prerequisites

To try out the Amazon Kendra connector for Dropbox, you need the following:

Configure a Dropbox app and gather connection details

Before we set up the Dropbox data source, we need a few details about your Dropbox repository. Let’s gather those in advance.

  1. Go to www.dropbox.com/developers.
  2. Choose App console.
  3. Sign in with your credentials (make sure you’re signing in to an Enterprise account).
  4. Choose Create app.
  5. Select Scoped access.
  6. Select Full Dropbox (or the name of the specific folder you want to index).
  7. Enter a name for your app.
  8. Choose Create app.
    You can see the configuration screen with a set of tabs.
  9. To set up permissions, choose the Permissions tab.
  10. Select a minimal set of permissions, as shown in the following screenshots.
  11. Choose Submit.
    A message appears saying that the permission change was successful.
  12. On the Settings tab, copy the app key.
  13. Choose Show next to App secret and copy the secret.
  14. Under Generated access token, choose Generate and copy the token.

Store these values in a safe place; we refer to them later.

The generated access token is valid for up to 4 hours. You have to generate a new access token each time you index the content.

Store Dropbox credentials in Secrets Manager

To store your Dropbox credentials in Secrets Manager, complete the following steps:

  1. On the Secrets Manager console, choose Store a new secret.
  2. Choose Other type of secret.
  3. Create three key-value pairs for appKey, appSecret, and refreshToken and enter the values saved from Dropbox.
  4. Choose Save.
  5. For Secret name, enter a name (for example, AmazonKendra-dropbox-secret).
  6. Enter an optional description.
  7. Choose Next.
  8. In the Configure rotation section, keep all settings at their defaults and choose Next.
  9. On the Review page, choose Store.
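
The console steps above can also be scripted. The following is a minimal Python sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The function names build_dropbox_secret and store_dropbox_secret are hypothetical helpers written for this illustration; only the key names (appKey, appSecret, refreshToken) and the example secret name come from the steps above.

```python
import json


def build_dropbox_secret(app_key: str, app_secret: str, token: str) -> str:
    """Return the JSON string holding the three key-value pairs from step 3."""
    if not (app_key and app_secret and token):
        raise ValueError("appKey, appSecret, and refreshToken must all be non-empty")
    return json.dumps(
        {"appKey": app_key, "appSecret": app_secret, "refreshToken": token}
    )


def store_dropbox_secret(secret_name: str, secret_string: str) -> str:
    """Create the secret in Secrets Manager and return its ARN.

    Assumes boto3 is installed and credentials and a Region are configured.
    """
    import boto3  # imported lazily so build_dropbox_secret works without boto3

    client = boto3.client("secretsmanager")
    response = client.create_secret(
        Name=secret_name,
        Description="Dropbox credentials for the Amazon Kendra connector",
        SecretString=secret_string,
    )
    return response["ARN"]


# Example usage (placeholders, not real credentials):
# payload = build_dropbox_secret("<appkey>", "<appsecret>", "<token>")
# arn = store_dropbox_secret("AmazonKendra-dropbox-secret", payload)
```

Keeping the payload builder separate from the AWS call makes it possible to check the secret's shape locally before anything is written to your account.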

Configure the Amazon Kendra connector for Dropbox

To configure the Amazon Kendra connector, complete the following steps:

  1. On the Amazon Kendra console, choose Create an Index.
  2. For Index name, enter a name for the index (for example, my-dropbox-index).
  3. Enter an optional description.
  4. For Role name, enter an IAM role name.
  5. Configure optional encryption settings and tags.
  6. Choose Next.
  7. In the Configure user access control section, leave the settings at their defaults and choose Next.
  8. For Provisioning editions, select Developer edition.
  9. Choose Create.
    This creates and propagates the IAM role and then creates the Amazon Kendra index, which can take up to 30 minutes.
  10. Choose Data sources in the navigation pane.
  11. Under Dropbox, choose Add connector.
  12. For Data source name, enter a name (for example, my-dropbox-connector).
  13. Enter an optional description.
  14. Choose Next.
  15. For Type of authentication token, select Access Token (temporary use).
  16. For AWS Secrets Manager secret, choose the secret you created earlier.
  17. For IAM role, choose Create a new role.
  18. For Role name, enter a name (for example, AmazonKendra-dropbox-role).
  19. Choose Next.
  20. For Select entities or content types, choose your content types.
  21. For Frequency, choose Run on demand.
  22. Choose Next.
  23. Set any optional field mappings and choose Next.
  24. Choose Review and Create and choose Add data source.
  25. Choose Sync now.
  26. Wait for the sync to complete.
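
Because the access token expires and the sync runs on demand, it can be convenient to trigger the last two steps (Sync now, then wait) from a script. The sketch below is one possible approach, assuming a boto3 Kendra client created with boto3.client("kendra"). The function name run_sync, the polling interval, and the index and data source IDs are assumptions for illustration. It also assumes the new job appears on the first page of list_data_source_sync_jobs results.

```python
import time

# Sync job statuses after which no further progress is expected.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "INCOMPLETE"}


def run_sync(client, index_id, data_source_id, poll_seconds=60, max_polls=120):
    """Start a data source sync job and poll until it reaches a terminal status.

    `client` is expected to behave like boto3.client("kendra").
    Returns the final status string, or raises TimeoutError.
    """
    execution_id = client.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )["ExecutionId"]

    for _ in range(max_polls):
        history = client.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        ).get("History", [])
        for job in history:
            # Match on the execution ID so older jobs are ignored.
            if (
                job.get("ExecutionId") == execution_id
                and job.get("Status") in TERMINAL_STATUSES
            ):
                return job["Status"]
        time.sleep(poll_seconds)

    raise TimeoutError(f"Sync job {execution_id} did not finish in time")


# Example usage (IDs are placeholders):
# import boto3
# status = run_sync(boto3.client("kendra"), "<index-id>", "<data-source-id>")
```

Passing the client in as a parameter keeps the polling logic easy to exercise with a stub before pointing it at a real index.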

Test the solution

Now that you have ingested the content from your Dropbox account into your Amazon Kendra index, you can test some queries.

Go to your index and choose Search indexed content. Enter a sample search query and test out your search results (your query will vary based on the contents of your account).

The Dropbox connector also crawls local identity information from Dropbox. For users, it sets the user's email ID as the principal. For groups, it sets the group ID as the principal. To filter search results by users or groups, go to the search console.

Expand “Test query with user name or groups” and choose the “apply user name or groups” button.

Enter the user names, group names, or both, and choose Apply. Next, enter the search query and press Enter. This returns a set of results filtered by your criteria.
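
The same kind of filtered query can be issued programmatically. The following is a hedged sketch, assuming a boto3 Kendra client and that the user and group values match the principals the connector crawled (email IDs for users, group IDs for groups). The helper names build_query_request and search are hypothetical, and the sample values in the comments are placeholders.

```python
def build_query_request(index_id, query_text, user_id=None, groups=None):
    """Build keyword arguments for the Kendra Query API.

    A UserContext is added only when a user ID or groups are supplied,
    mirroring the optional user and group filter in the search console.
    """
    request = {"IndexId": index_id, "QueryText": query_text}
    user_context = {}
    if user_id:
        user_context["UserId"] = user_id
    if groups:
        user_context["Groups"] = list(groups)
    if user_context:
        request["UserContext"] = user_context
    return request


def search(client, index_id, query_text, user_id=None, groups=None):
    """Run the query and return (result type, document title) pairs.

    `client` is expected to behave like boto3.client("kendra").
    """
    response = client.query(
        **build_query_request(index_id, query_text, user_id, groups)
    )
    return [
        (item.get("Type"), item.get("DocumentTitle", {}).get("Text", ""))
        for item in response.get("ResultItems", [])
    ]


# Example usage (placeholders):
# import boto3
# results = search(boto3.client("kendra"), "<index-id>", "<your query>",
#                  user_id="<user-email>", groups=["<group-id>"])
```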

Congratulations! You have successfully used Amazon Kendra to surface answers and insights based on the content indexed from your Dropbox account.

Generate permanent tokens for offline access

The instructions in this post walk you through creating, configuring, and using a temporary access token. Apps can also get long-term access by requesting offline access, in which case the app receives a refresh token that can be used to retrieve new short-lived access tokens as needed, without further manual user intervention. You can find more information in the Dropbox OAuth Guide and Dropbox authorization documentation. Use the following steps to create a permanent refresh token (for example, to set the sync to run on a schedule):

  1. Get the app key and app secret as before.
  2. In a new browser window, navigate to https://www.dropbox.com/oauth2/authorize?token_access_type=offline&response_type=code&client_id=<appkey>.
  3. Accept the defaults and choose Submit.
  4. Choose Continue.
  5. Choose Allow.
    An access code is generated for you.
  6. Copy the access code.
    Next, you exchange the access code for a refresh token.
  7. In a terminal window, run the following curl command:
    curl https://api.dropbox.com/oauth2/token -d code=<receivedcode> -d grant_type=authorization_code -u <appkey>:<appsecret>
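
For readers who prefer a script to a browser URL and curl, the two requests above can be sketched in Python using only the standard library. This is an illustrative equivalent, not an official client. The function names are hypothetical, and the sketch assumes the token endpoint's JSON response contains a refresh_token field, which is what Dropbox returns when offline access is requested.

```python
import base64
import json
import urllib.parse
import urllib.request

AUTHORIZE_URL = "https://www.dropbox.com/oauth2/authorize"
TOKEN_URL = "https://api.dropbox.com/oauth2/token"


def build_authorize_url(app_key):
    """Return the URL from step 2 with the app key filled in."""
    params = {
        "token_access_type": "offline",
        "response_type": "code",
        "client_id": app_key,
    }
    return AUTHORIZE_URL + "?" + urllib.parse.urlencode(params)


def exchange_code_for_refresh_token(app_key, app_secret, code):
    """Mirror the curl command in step 7 and return the refresh token.

    Sends the code as form data with HTTP Basic authentication
    (app key as user name, app secret as password).
    """
    body = urllib.parse.urlencode(
        {"code": code, "grant_type": "authorization_code"}
    ).encode("ascii")
    credentials = base64.b64encode(
        f"{app_key}:{app_secret}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(
        TOKEN_URL,
        data=body,  # supplying data makes this a POST, as curl -d does
        headers={"Authorization": f"Basic {credentials}"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["refresh_token"]


# Example usage (placeholders):
# print(build_authorize_url("<appkey>"))
# token = exchange_code_for_refresh_token("<appkey>", "<appsecret>", "<receivedcode>")
```

The access code can only be exchanged once, so run the exchange soon after copying it and store the returned refresh token immediately.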

The JSON response to this command includes a refresh_token value. You can store this refresh token along with the app key and app secret to configure a permanent token in the data source configuration for Amazon Kendra. Amazon Kendra then generates access tokens as needed.

Limitations

This solution has the following limitations:

  • File comments are not imported into the index
  • You don’t have the option to add custom metadata for Dropbox
  • Google Docs, Sheets, and Slides require a Google Workspace or Google account and are not included

Conclusion

With the Dropbox connector for Amazon Kendra, organizations can tap into the repository of information stored in their account securely using intelligent search powered by Amazon Kendra.

In this post, we introduced you to the basics, but there are many additional features that we didn’t cover. For example:

  • You can enable user-based access control for your Amazon Kendra index and restrict access to users and groups that you configure
  • You can specify allowedUsersColumn and allowedGroupsColumn so you can apply access controls based on users and groups, respectively
  • You can map additional fields to Amazon Kendra index attributes and enable them for faceting, search, and display in the search results
  • You can integrate the Dropbox data source with the Custom Document Enrichment (CDE) capability in Amazon Kendra to perform additional attribute mapping logic and even custom content transformation during ingestion

To learn about these possibilities and more, refer to the Amazon Kendra Developer Guide.


About the author

Ashish Lagwankar is a Senior Enterprise Solutions Architect at AWS. His core interests include AI/ML, serverless, and container technologies. Ashish is based in the Boston, MA, area and enjoys reading, the outdoors, and spending time with his family.
