Added filename as a configuration option. #215

Open
wants to merge 3 commits into
base: main

Conversation

barqshasbite

This new option can be used to specify how files are named when
uploaded to S3. It supports logstash string interpolation (sprintf)
so can be used to generate unique filenames based on the data from
an event.

Addresses #134
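
As a rough illustration of the idea, the sketch below mimics `%{field}` interpolation with a plain Ruby hash standing in for an event. The `interpolate` helper and the field names are hypothetical and are not the plugin's code; inside Logstash the substitution would be handled by the event's own sprintf support.

```ruby
# Hypothetical stand-in for Logstash's %{field} interpolation.
# A plain Hash plays the role of the event; this is not the plugin's code.
def interpolate(template, event)
  template.gsub(/%\{([^}]+)\}/) do
    field = Regexp.last_match(1)
    # Leave the placeholder untouched when the field is missing
    event.key?(field) ? event[field].to_s : "%{#{field}}"
  end
end

event = { "customer" => "acme", "order_id" => 1042 }
filename = interpolate("%{customer}-%{order_id}.json", event)
puts filename
```

A template built from fields that differ per event yields a different filename per event, which is what makes the single-file, write-once use case safe.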

@Forbzy

Forbzy commented Mar 17, 2020

@barqshasbite I was looking for this functionality. I'm pleased to see someone was working on it. It would be very useful to be able to set file names. Can you say what the current conflict issues are?

@BillYoungman

I noticed that the last commit for this feature failed on December 19, 2019. Has anything been attempted since then? From what I'm seeing, I'm not the only one who is interested in this feature.

Thanks,
Bill

@barqshasbite
Author

If I remember correctly, the build initially failed for reasons not related to my change (other builds were also failing at the time).

Since then, I have not revisited this change. In its current state, it works for my limited use case, so I did not take it any further. My use case is uploading single, write-once JSON files to S3 with a unique name.

I put up the pull request knowing that other people were interested in the feature and may want to use my variation. It is incomplete, though, in that it does not support the size_file and time_file configuration options for rolling over filenames with a partN filename component. It will always upload with the same filename, so it could overwrite existing files if you do not have unique naming set up. Adding support for the partN filename component would round this out and make it a more complete feature.
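
To make the missing piece concrete, here is a minimal sketch of a partN component added to a user-supplied base name. The `PartNamer` class and its method names are hypothetical and only illustrate the idea; they are not taken from the plugin.

```ruby
# Hypothetical sketch: append a part counter to a user-supplied base name
# so that each rotation produces a distinct object key.
class PartNamer
  def initialize(base_name, extension)
    @base_name = base_name
    @extension = extension
    @counter = 0
  end

  # Name for the file currently being written
  def current
    "#{@base_name}.part#{@counter}.#{@extension}"
  end

  # Called when the file rolls over (size_file or time_file reached)
  def rotate!
    @counter += 1
    current
  end
end

namer = PartNamer.new("orders-2020-06-25", "txt")
puts namer.current
puts namer.rotate!
```

With a counter in the name, a rollover no longer reuses the previous key, so an earlier upload is not overwritten even when the base name stays the same.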

@BillYoungman

Our company processes and calculates Sales & Consumer Use Taxes for our clients, so on average we get about 10,000 transactions per second. These are moved into Elasticsearch indexes via Filebeat, and in addition we move them into AWS S3 buckets for use in calculating client metric data. The default S3 naming convention of 'ls.s3.xxx' was making it difficult to work with those files, hence our need for custom naming, so I was really glad when I came across this feature request.

I took your code, created a new version of the plug-in in our development environment, copied your code into that local version, and modified my logstash.yml file to point to the local plug-in. I then made the following modifications to the temporary_file_factory.rb file:

    # name = filename == "" ? generate_name : filename
    name = filename == "" ? generate_name : generate_custom_name(filename)

I created a new method:

```ruby
def generate_custom_name(filename)
  filename = "#{filename}.#{SecureRandom.uuid}.#{current_time}"

  if tags.size > 0
    "#{filename}.tag_#{tags.join('.')}.part#{counter}.#{extension}"
  else
    "#{filename}.part#{counter}.#{extension}"
  end
end
```

I ran it through extensive testing using JMeter, and for us it is working like a charm. I have noticed a couple of things, though, and they apply to both the original version and my version of the plug-in. Neither one honors the size_file option, but we think that is because data is coming in so fast that, by the time the rotate/upload is triggered, the file is already larger than the size set in logstash.conf. Both versions do honor the time_file value.
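
The overshoot hypothesis can be illustrated with a small simulation. This is a simplified model and not the plugin's code: it assumes the size check only runs after a whole batch has been written, and the threshold and batch size are made-up numbers.

```ruby
# Simplified simulation (not the plugin's code): the size check runs only
# after a whole batch has been appended, so the file can overshoot the limit.
SIZE_FILE = 1_000   # assumed rotation threshold in bytes
BATCH_BYTES = 640   # assumed bytes written per batch

def sizes_at_rotation(batches, size_file, batch_bytes)
  rotated = []
  current = 0
  batches.times do
    current += batch_bytes          # write the whole batch first
    if current >= size_file         # then check the rotation condition
      rotated << current            # size of the file that gets uploaded
      current = 0                   # start a fresh file
    end
  end
  rotated
end

p sizes_at_rotation(6, SIZE_FILE, BATCH_BYTES)
```

Under these assumptions every uploaded file is 1,280 bytes rather than 1,000, and the gap grows with the batch size, which would be consistent with files exceeding size_file under heavy load.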

I apologize for the verbosity of this post but I wanted to put some context around what I did and what we saw.

@Forbzy

Forbzy commented Jun 24, 2020

It's good to hear that work on this feature is continuing. @BillYoungman, is your version of the plugin still up to date with new changes to the Master branch? For my use case I would need the time_file option to work, because I'd need to output data hourly and daily. Does this option work for your version, @BillYoungman?

@BillYoungman

@Forbzy it is still in my local file system. This is my first attempt at doing any work like this, and I wasn't entirely sure of the process involved to officially work in here, so I didn't want to do anything that might be unauthorized. That being said, a while back I did sign the letter to become a contributing developer.

Let me do some more testing, focusing on the time_file property in particular, and I will post my findings.

@BillYoungman

BillYoungman commented Jun 25, 2020

@barqshasbite I'm not sure if this is the correct place for questions. If it isn't, please direct me to the right place. Thanks.
Adding @Forbzy

But here goes--
time_file is working, but on closer testing of just the size_file option (I was using size_and_time in my earlier testing, which was actually masking the size setting), size-based rotation is not working when generate_custom_name(filename) is called: it never rolls the files over. However, when I let the plugin use the default naming method, generate_name, it works fine.

When I run my tests with the default method call I see calls to

    if @rotation.rotate?(temp_file)
      @logger.info("Rotate file",
                   :strategy => @rotation.class.name,
                   :key => temp_file.key,
                   :path => temp_file.path)

      upload_file(temp_file)
      factory.rotate!
    end

When I set a custom filename, this is not being called at all. So my question is: where and how does the default pattern call work when the custom pattern doesn't?

I've been struggling with this all day, and it isn't helping that this is my first foray into Ruby as well.

Thanks,
Bill

@Forbzy

Forbzy commented Nov 4, 2020

@BillYoungman Is there no way of just allowing us to set the filename ourselves? For instance, you can do this with the file output plugin when setting the path.

@yogevyuval

@Forbzy @BillYoungman @barqshasbite
This feature is very much needed. Without it, the plugin can be useless for many use cases, as can be seen in the different threads.

Is there any update on this PR? Is there something that can be done to make this happen? Happy to help if needed.

@webminster

I'd like to upvote this as well... I'm trying to use Logz.io with its S3 bucket shipper, and it wants S3 file names to be in ascending sort order. With the random name, I can't honor that requirement.
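
For illustration, one naming scheme that would sort in ascending order is a UTC timestamp followed by a zero-padded part number. The `sortable_name` helper and the layout of the name are assumptions made for this sketch; they come neither from the plugin nor from Logz.io.

```ruby
# Hypothetical naming scheme: a UTC timestamp plus a zero-padded part number
# sorts lexicographically in the same order the files were created.
def sortable_name(base, time, part, extension)
  stamp = time.getutc.strftime("%Y%m%dT%H%M%SZ")
  format("%s.%s.part%05d.%s", base, stamp, part, extension)
end

t = Time.utc(2020, 11, 4, 9, 30, 0)
names = [
  sortable_name("logs", t, 0, "txt"),
  sortable_name("logs", t, 1, "txt"),
  sortable_name("logs", t + 3600, 0, "txt")
]
puts names
```

Because every component is fixed-width, plain string sorting of the keys matches creation order, which a random UUID in the name cannot guarantee.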

@yjagdale

yjagdale commented Sep 8, 2023

@barqshasbite - looks like this plugin is no longer maintained. Can you please share a gem file that people can install manually and use?

7 participants