Pythian Blog: Technical Track

Minimal Twitter to Google Pub/Sub example with Scala

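Recently I was looking for a simple Twitter to Pub/Sub streaming pipeline and ended up writing my own implementation in Scala. I tried to make it as compact as possible, so I chose the dispatch HTTP library and the Google Pub/Sub client library for Java. To get started, you need a Google Cloud Platform service account key and a Twitter API consumer key and access tokens.

1. Create a publisher:

[code language="scala"]
import java.io.FileInputStream
import com.google.api.gax.core.FixedCredentialsProvider
import com.google.auth.oauth2.GoogleCredentials
import com.google.cloud.pubsub.v1.Publisher
import com.google.pubsub.v1.TopicName

// Publisher for the target topic, authenticated with the service account key file
val publisher = Publisher.
  newBuilder(TopicName.of("projectId", "topic")).
  setCredentialsProvider(FixedCredentialsProvider.create(
    GoogleCredentials.fromStream(new FileInputStream("key.json")))).
  build()
[/code]

2. Get the stream of statuses/sample messages and publish them:

[code language="scala"]
import com.google.protobuf.ByteString
import com.google.pubsub.v1.PubsubMessage
import dispatch._, Defaults._
import dispatch.oauth._
// In recent dispatch releases the OAuth key classes come from async-http-client 2.x;
// older releases used com.ning.http.client.oauth instead.
import org.asynchttpclient.oauth.{ConsumerKey, RequestToken}

Http.default(
  url("https://stream.twitter.com/1.1/statuses/sample.json")
    <@ (
      new ConsumerKey("consumerKey", "consumerSecret"),
      new RequestToken("accessToken", "accessTokenSecret"))
    > as.stream.Lines { tweet =>
      // Each line of the stream is one JSON-encoded tweet; publish it as-is
      publisher.publish(PubsubMessage.newBuilder.setData(ByteString.copyFromUtf8(tweet)).build())
    })
[/code]

As improvements, you may want to configure BatchingSettings for the Publisher and add exception handlers. You can find the full source code here.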
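The two improvements mentioned above could look roughly like the sketch below. It is not part of the original source code: the object name BatchedPublishing, the helper publishWithLogging and the threshold values are assumptions chosen for illustration, and the Duration type accepted by setDelayThreshold may differ depending on the version of the client library you use.

[code language="scala"]
import java.io.FileInputStream

import com.google.api.core.{ApiFuture, ApiFutureCallback, ApiFutures}
import com.google.api.gax.batching.BatchingSettings
import com.google.api.gax.core.FixedCredentialsProvider
import com.google.auth.oauth2.GoogleCredentials
import com.google.cloud.pubsub.v1.Publisher
import com.google.common.util.concurrent.MoreExecutors
import com.google.protobuf.ByteString
import com.google.pubsub.v1.{PubsubMessage, TopicName}
import org.threeten.bp.Duration

object BatchedPublishing {

  // Illustrative thresholds (assumptions, not recommendations): a batch is sent
  // after 100 messages, 1 MiB of data, or 200 ms, whichever is reached first.
  val batching: BatchingSettings = BatchingSettings.newBuilder()
    .setElementCountThreshold(100L)
    .setRequestByteThreshold(1024L * 1024L)
    .setDelayThreshold(Duration.ofMillis(200))
    .build()

  // Same publisher as in step 1, with the batching settings applied.
  def buildPublisher(): Publisher =
    Publisher.newBuilder(TopicName.of("projectId", "topic"))
      .setCredentialsProvider(FixedCredentialsProvider.create(
        GoogleCredentials.fromStream(new FileInputStream("key.json"))))
      .setBatchingSettings(batching)
      .build()

  // Hypothetical helper: publish one tweet and log a failed publish
  // instead of ignoring the returned future.
  def publishWithLogging(publisher: Publisher, tweet: String): Unit = {
    val message = PubsubMessage.newBuilder
      .setData(ByteString.copyFromUtf8(tweet))
      .build()
    val future: ApiFuture[String] = publisher.publish(message)
    ApiFutures.addCallback(future, new ApiFutureCallback[String] {
      override def onSuccess(messageId: String): Unit = ()
      override def onFailure(t: Throwable): Unit =
        System.err.println(s"Publish failed: ${t.getMessage}")
    }, MoreExecutors.directExecutor())
  }
}
[/code]

With this in place, the body of the as.stream.Lines handler in step 2 would call BatchedPublishing.publishWithLogging(publisher, tweet) instead of calling publisher.publish directly. Calling publisher.shutdown() before the program exits gives the client a chance to flush any messages still waiting in a batch.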
