
GoRepost

A Content Resharing Engine Written in Go.
Explore the docs »

  1. About The Project
  2. System Design
  3. Overall System Architecture
  4. Deployment
  5. Miscellaneous
  6. License

About The Project

A few years ago I wrote a few scripts to automate posting content from Reddit to Instagram. Since then I've had the idea to capitalize on the rise of short-form content – this is an implementation of that idea. I wanted a way to automate posting content from multiple sources, regardless of the platform (YouTube, Instagram, TikTok, etc.), to another platform of my choice – something scalable and well thought out, based on my previous experience with this project. It's a project I have been meaning to work on for quite some time now, and I'm happy with the way it has turned out so far. That being said, it is still a prototype and you can track my progress here.

Built With

System Design

A Task makes up the basic structure of my application. It is defined as follows and consists of many publishers and one subscriber. Channels in Go seemed like the right message-passing system to start with (they gave me the pub-sub mechanism I was looking for to implement something like this). Every task, when created, has its own channel where the publishers write and the subscriber consumes – making each component responsible for its own fetching or posting mechanism and allowing for separation of concerns.

type Task struct {
	Id         string
	Publishers []publisher.Publisher
	Subscriber subscriber.Subscriber
	Quit       chan struct{} // covered later
}

It has a Run() method which starts a goroutine for every publisher (which publishes posts to the channel) and one for the subscriber (which consumes from this channel and is responsible for: (1) ensuring uniqueness of the post, (2) storing the file locally, (3) uploading it to cloud storage, (4) deleting it from both after posting it to the desired platform, and (5) maintaining a certain posting frequency). The definitions of Publisher and Subscriber are outlined below for reference.

type Publisher interface {
	PublishTo(c chan<- model.Post, quit <-chan struct{})
	GetPublisherId() string
}

type Subscriber interface {
	SubscribeTo(c <-chan model.Post)
	GetSubscriberId() string
}
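To make the flow concrete, here is a minimal sketch of what Run() could look like under these definitions – every publisher and the subscriber get their own goroutine and share the task's channel. The channel buffering and exact wiring are my illustration, not necessarily the repository's implementation.

// Sketch of Run(): fan-in from all publishers into one channel, consumed by
// the single subscriber. Buffering and error handling are simplifications.
func (t *Task) Run() {
	c := make(chan model.Post)

	// Each publisher writes posts to the shared channel and watches Quit.
	for _, p := range t.Publishers {
		go p.PublishTo(c, t.Quit)
	}

	// The subscriber consumes from the channel and handles de-duplication,
	// storage, upload, cleanup, and posting frequency.
	go t.Subscriber.SubscribeTo(c)
}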

NOTE: As of now, YouTube is the only kind of publisher and Instagram is the only kind of subscriber that is supported. Please follow the rest of the document with that in mind.

There is also a Quit channel that simply exists for force quitting a task; it is there for use by the TaskService, the service responsible for managing the lifecycle of all tasks by maintaining a map of the tasks currently running and the quit channel to signal if a task should be stopped.
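A rough sketch of what such a service could look like (the map layout, method names, and locking are assumptions for illustration, with Task assumed to be in scope):

// TaskService sketch: tracks running tasks by id so they can be stopped later.
type TaskService struct {
	mu    sync.Mutex
	tasks map[string]*Task
}

// Start registers a task and kicks off its publishers and subscriber.
func (s *TaskService) Start(t *Task) {
	s.mu.Lock()
	s.tasks[t.Id] = t
	s.mu.Unlock()
	t.Run()
}

// Stop closes the task's Quit channel, signalling its goroutines to exit,
// and removes it from the map.
func (s *TaskService) Stop(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if t, ok := s.tasks[id]; ok {
		close(t.Quit)
		delete(s.tasks, id)
	}
}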

Overall System Architecture

With the main application logic out of the way, I'll cover the entire software architecture from a higher level. Below is the overall workflow of the project in its current state. Each component is briefly discussed below.

Persistence with PostgreSQL

I needed some kind of persistence mechanism to ensure I do not post the same content twice (subscriber-side logic), so I decided to go with Google Cloud Platform and created a PostgreSQL instance there. I didn't use GORM or any other object-relational mapper since it didn't fit my needs and I was focused on delivering the project ...before my cloud services credits expire.
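The uniqueness check itself only needs database/sql; a sketch along these lines (the table and column names are placeholders of mine, not the actual schema):

import "database/sql"

// alreadyPosted reports whether a post with this source id has been seen before.
// Table and column names are placeholders for illustration.
func alreadyPosted(db *sql.DB, sourceID string) (bool, error) {
	var exists bool
	err := db.QueryRow(
		"SELECT EXISTS (SELECT 1 FROM posts WHERE source_id = $1)",
		sourceID,
	).Scan(&exists)
	return exists, err
}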

Cloud Bucket Storage

In order to use Meta's developer API for content publishing (subscriber-side logic), I needed the videos to be stored locally (in the data/ directory) and then hosted on a web server before hitting their Graph API endpoint to publish the video. For this reason, I chose cloud bucket storage: it gave me an easy way to manage storage with Go's client library, and I was already using GCP's database services.
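The upload step with the Go client library looks roughly like this (the bucket, object, and function names are placeholders of mine):

import (
	"context"
	"io"
	"os"

	"cloud.google.com/go/storage"
)

// uploadToBucket copies a locally stored video from data/ into a GCS bucket.
// Error handling is kept minimal for the sketch.
func uploadToBucket(ctx context.Context, bucket, object, localPath string) error {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	f, err := os.Open(localPath)
	if err != nil {
		return err
	}
	defer f.Close()

	w := client.Bucket(bucket).Object(object).NewWriter(ctx)
	if _, err := io.Copy(w, f); err != nil {
		return err
	}
	// Close flushes the writer and finalizes the object.
	return w.Close()
}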

Cloud and Local Logging

For app-wide logging, I chose to go with GCP again, since it would give me a centralized way of going over my logs. For development purposes, I usually log them locally (logs get saved into the log/ directory). Moreover, I had plans of containerizing the application, so I needed a good way to monitor or debug it. Local and cloud logging options can be set at an application level through the use of environment variables. In production, I have cloud enabled and local logging disabled.
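A sketch of how that toggle could work (the environment variable, log name, and file path are illustrative, not necessarily the ones the project uses):

import (
	"context"
	"log"
	"os"

	"cloud.google.com/go/logging"
)

// newLogger picks a destination based on an environment variable.
// CLOUD_LOGGING, the "gorepost" log name, and log/app.log are placeholders.
func newLogger(ctx context.Context, projectID string) (*log.Logger, error) {
	if os.Getenv("CLOUD_LOGGING") == "true" {
		client, err := logging.NewClient(ctx, projectID)
		if err != nil {
			return nil, err
		}
		// StandardLogger returns a *log.Logger that writes to Cloud Logging.
		return client.Logger("gorepost").StandardLogger(logging.Info), nil
	}

	// Local fallback: append to a file under log/.
	f, err := os.OpenFile("log/app.log", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	return log.New(f, "", log.LstdFlags), nil
}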

Web API with Gin

Since I wanted to take as hands-off an approach to the project as possible, I decided to build an API around it that would allow me to manage my tasks by interfacing with the TaskService. I had used Gin once before and was quite happy with the experience, so I decided to go with it again and wrote my handlers. Admittedly though, the middleware in its current state is weak (just a token set as an environment variable on the web server that needs to be added to my web requests).
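A rough sketch of the routing and the token check described above (the header name, environment variable, and handler bodies are stand-ins for illustration):

import (
	"net/http"
	"os"

	"github.com/gin-gonic/gin"
)

// tokenAuth compares a request header against a token stored in an
// environment variable. Header and variable names are placeholders.
func tokenAuth() gin.HandlerFunc {
	token := os.Getenv("API_TOKEN")
	return func(c *gin.Context) {
		if c.GetHeader("Authorization") != token {
			c.AbortWithStatus(http.StatusUnauthorized)
			return
		}
		c.Next()
	}
}

func setupRouter() *gin.Engine {
	r := gin.Default()
	r.Use(tokenAuth())

	// Routes mirror the sample requests shown below; in the real handlers the
	// platform and id would be handed to the TaskService.
	r.POST("/tasks/:platform", func(c *gin.Context) {
		c.JSON(http.StatusCreated, gin.H{"platform": c.Param("platform")})
	})
	r.DELETE("/tasks/:platform/:id", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{"id": c.Param("id")})
	})
	return r
}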

A sample POST request to tasks/:platform

A sample DELETE request to tasks/:platform/:id

Deployment

The only way to make changes to the main branch is by opening PRs. Once a PR is merged, the Docker image builds and, if successful, is pushed to the DigitalOcean container registry, which serves as the main hosting place for my artifacts. I also publish this as a private image on DockerHub for my own future reference. The flow described is displayed in the image below.

GitHub Actions

I have two GitHub actions in place:

PR Linting

One for linting PRs and making sure no weird-looking code gets committed to main.

Docker Image

Another one for building and pushing the Docker image to the registry, acting as my CI/CD pipeline. It also runs a script that SSHes into my droplet, replaces the running image with the newest Docker image from the registry, and starts running it.

DigitalOcean Droplet

I tried Google Cloud Run, Kubernetes, and Compute Engine, but nothing really suited what I was looking for. I decided to go with another platform and created a Droplet (VM) on DigitalOcean. I have really liked my experience with DigitalOcean so far; it gave me the streamlined developer solution I was looking for.

Miscellaneous

There are certain things I wanted to discuss that didn't fit into any of the topics above, so I'm briefly going over them here.

Future Plans

The project page is the best way to track future plans and current progress.

Overall, I need to add unit tests (and run them on PRs), and I think I can do a better job with logging. There are also still a lot of other platforms I need to implement – for both publishers and subscribers.

My application is also quite stateful, which is something I might want to look into redesigning before moving forward with the project. I still like it because it helped me bridge some knowledge gaps I had in Go.

Environment Variables

All required environment variables so far can be seen in the .env-sample file. (PS: having the actual file is not necessary – I just use it so I have my secrets in one place, but you can set those environment variables directly.) For obvious reasons, the actual variables file itself isn't tracked, and I generate it through GitHub Actions – this way I don't have to set the variables each time I change the machine I deploy on. It also means my Docker image must remain private, so I ensure no one has access to my containers. There is also a sample service account key credentials JSON file that is needed for accessing various GCP services through the client libraries.
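For illustration, the variables can simply be read at startup; a small sketch with placeholder names (GOOGLE_APPLICATION_CREDENTIALS is the one standard variable the GCP client libraries read to locate the service account key):

import (
	"fmt"
	"os"
)

// mustGetenv is a small helper (not from the repository) that fails fast when
// a required variable is missing, so misconfiguration surfaces at startup.
func mustGetenv(key string) string {
	v := os.Getenv(key)
	if v == "" {
		panic(fmt.Sprintf("missing required environment variable %q", key))
	}
	return v
}

var (
	dbURL     = mustGetenv("DATABASE_URL")                   // placeholder name
	credsPath = mustGetenv("GOOGLE_APPLICATION_CREDENTIALS") // read by GCP clients
)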

License

Distributed under the MIT License. See LICENSE for more information.
