I'd like to play around with Google Cloud Pub/Sub and processing messages in Dataflow. Are there any public data feeds in Pub/Sub that I can use to get started?
In the Dataflow WordCount example, input is read from a file in Cloud Storage, gs://dataflow-samples/shakespeare/kinglear.txt. It seems that dataflow-samples is accessible to all projects, which is very convenient for getting started. Is there anything similar for Pub/Sub?
Google currently maintains a public topic, projects/pubsub-public-data/topics/taxirides-realtime, as part of a Cloud Dataflow codelab.
You can find more information on how to use it here.
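To read from that topic, you create a subscription in your own project that points at it. Below is a minimal sketch using the google-cloud-pubsub Python client (2.x request-style calls). The project ID and subscription ID are placeholders you would supply yourself, and I'm assuming the message payloads are UTF-8 JSON, so treat it as a starting point rather than a reference implementation.

```python
import json

# Public topic named in the answer above.
PUBLIC_TOPIC = "projects/pubsub-public-data/topics/taxirides-realtime"


def subscription_path(project_id, subscription_id):
    """Build the full resource name of a subscription in your own project."""
    return f"projects/{project_id}/subscriptions/{subscription_id}"


def decode_message(data):
    """Decode one message payload (assumption: payloads are UTF-8 JSON)."""
    return json.loads(data.decode("utf-8"))


def pull_sample(project_id, subscription_id, max_messages=5):
    """Create a subscription on the public topic and pull a few messages.

    project_id and subscription_id are hypothetical values for your own
    project. Requires `pip install google-cloud-pubsub` and credentials.
    """
    # Imported here so the helpers above work without the client library.
    from google.cloud import pubsub_v1

    sub_path = subscription_path(project_id, subscription_id)
    with pubsub_v1.SubscriberClient() as subscriber:
        # The subscription lives in YOUR project; the topic lives in Google's.
        # This call raises AlreadyExists if the subscription is already there.
        subscriber.create_subscription(
            request={"name": sub_path, "topic": PUBLIC_TOPIC}
        )
        response = subscriber.pull(
            request={"subscription": sub_path, "max_messages": max_messages}
        )
        received = response.received_messages
        rides = [decode_message(m.message.data) for m in received]
        ack_ids = [m.ack_id for m in received]
        if ack_ids:
            # Acknowledge so the same messages are not redelivered.
            subscriber.acknowledge(
                request={"subscription": sub_path, "ack_ids": ack_ids}
            )
    return rides
```

Calling pull_sample("my-project", "taxi-sub") with your own IDs should return a handful of decoded messages. Delete the subscription when you are done, since an idle subscription on a busy topic keeps accumulating a backlog.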
Additionally, you can use Dataflow with BigQuery. Google provides this comprehensive set of public data.
What do you mean by public datasets in Cloud Pub/Sub? In Cloud Pub/Sub, you have topics, publishers that send messages to those topics, and subscribers that receive messages from them. Every topic belongs to a project, so it doesn't make sense to have a public topic, if that's what you're asking.