Building a search for my Twitter account 🔍 with InstantSearch.js

Hey guys :wave:

I’ve been working this weekend on a little hack for myself, specifically on turning my own Twitter curation into a searchable database of knowledge.

I just published an article on Medium explaining the whole concept step by step, including where I struggled: https://medium.com/@antoineplu/building-a-search-for-my-twitter-account-dbbd0cc2a875#.pf53ow2k1

You’ll see I’m struggling a bit with how I need to rewrite my data as JSON so it can scale. If you have any ideas, I’m listening! :v:

You can try my demo directly on CodePen and check the code here.

I hope you enjoy it!

5 Likes

Hi Antoine,
That’s a cool implementation that you’re building here :slight_smile:

What kind of issues do you have with the data import / rewrite of the data in JSON? Are you importing those using the dashboard upload feature, or programmatically using one of the API clients?

1 Like

Hey @alex,
Thanks for your comments.

The thing is, I downloaded my Twitter archive of 20K tweets, and I need to convert it into JSON.
Right now I’m doing the work by hand, following this model for each tweet, but given how long it took me just to cover 2 months, I can’t see myself doing it for 6 years unless there is a way to automate it :confounded:

1 Like

@aplu Cool project, thanks for sharing! I’ve definitely felt this pain before.

I have a possible solution to remove the manual labor. The Twitter archive download also contains a JSON version of all your tweets in the data/js/tweets folder, one file per month. The format of each tweet is this:

{
  "source" : "\u003Ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003ETwitter Web Client\u003C\/a\u003E",
  "entities" : {
    "user_mentions" : [ {
      "name" : "Adam DuVander",
      "screen_name" : "adamd",
      "indices" : [ 40, 46 ],
      "id_str" : "818902",
      "id" : 818902
    }, {
      "name" : "Adam DuVander",
      "screen_name" : "adamd",
      "indices" : [ 47, 53 ],
      "id_str" : "818902",
      "id" : 818902
    } ],
    "media" : [ ],
    "hashtags" : [ ],
    "urls" : [ {
      "indices" : [ 54, 77 ],
      "url" : "https:\/\/t.co\/QYFgeHLxNe",
      "expanded_url" : "https:\/\/zapier.com\/engineering\/great-documentation-examples\/",
      "display_url" : "zapier.com\/engineering\/gr\u2026"
    } ]
  },
  "geo" : { },
  "id_str" : "827224893552881665",
  "text" : "8 great doc examples in 1 great post by @adamd\n@adamd https:\/\/t.co\/QYFgeHLxNe",
  "id" : 827224893552881665,
  "created_at" : "2017-02-02 18:39:18 +0000",
  "user" : {
    "name" : "Josh Dzielak \uD83D\uDD0E\uD83D\uDC99",
    "screen_name" : "dzello",
    "protected" : false,
    "id_str" : "45297280",
    "profile_image_url_https" : "https:\/\/pbs.twimg.com\/profile_images\/775481295279300608\/n8xLP5iX_normal.jpg",
    "id" : 45297280,
    "verified" : false
  }
}

It has more fields than you need, but you can filter them out before you index them.
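
For example, a transform could keep just the text, the date, and the expanded URLs, and reuse the tweet id as the Algolia objectID so that re-running the import updates records instead of duplicating them. This is only a sketch — the field selection here is arbitrary, keep whatever you actually want to search on:

// toRecord.js — a minimal sketch; pick your own fields.
function toRecord(tweet) {
  return {
    objectID: tweet.id_str, // reusing the tweet id avoids duplicates on re-imports
    text: tweet.text,
    created_at: tweet.created_at,
    urls: (tweet.entities.urls || []).map((u) => u.expanded_url),
    hashtags: (tweet.entities.hashtags || []).map((h) => h.text),
  };
}

module.exports = toRecord;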

What you need to do is iterate over all the files in the tweets directory, parse them into objects, transform the objects however you like, and then upload them to Algolia. You may need to delete this line from each file first:

Grailbird.data.tweets_2017_02 =

I don’t know why it’s there, but it will break the JSON parsing unless you remove it.
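
Putting it all together, a Node script along these lines should do it. This is a rough sketch, not tested against your archive — I’m assuming the algoliasearch npm client (whose saveObjects call batches large uploads for you) and the toRecord transform above; YOUR_APP_ID, YOUR_ADMIN_API_KEY, and the tweets index name are placeholders for your own setup:

// index-tweets.js — rough sketch; credentials and index name are placeholders.
const fs = require('fs');
const path = require('path');
const algoliasearch = require('algoliasearch');
const toRecord = require('./toRecord');

const client = algoliasearch('YOUR_APP_ID', 'YOUR_ADMIN_API_KEY');
const index = client.initIndex('tweets');

const tweetsDir = path.join(__dirname, 'data', 'js', 'tweets');

// Read every monthly file, drop the "Grailbird.data.tweets_YYYY_MM =" prefix
// (everything before the first "["), and parse the remaining JSON array.
let tweets = [];
for (const file of fs.readdirSync(tweetsDir)) {
  if (!file.endsWith('.js')) continue;
  const contents = fs.readFileSync(path.join(tweetsDir, file), 'utf8');
  tweets = tweets.concat(JSON.parse(contents.slice(contents.indexOf('['))));
}

// Transform every tweet and push the whole batch to Algolia.
index.saveObjects(tweets.map(toRecord))
  .then(() => console.log('Indexed ' + tweets.length + ' tweets'))
  .catch(console.error);

Run it from the root of the extracted archive (node index-tweets.js) and your 20K tweets should land in the index in one go.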

2 Likes