
Find the extension of a file using Perl

sanjeev
Joined: 21 Feb 2011

My friend asked me how to extract the extension from each name in a given list of files.
There are many ways to do it, but I used a small, simple regular expression to solve his problem.

The regular expression is: /(.*)\.(.*)/ or /\.([^.]*)$/
The first one is not very efficient and may take a lot of time if the number of files is huge, because it uses a greedy match twice and has to backtrack to find the last dot. It may be possible to avoid this using the lookahead and lookbehind features of regular expressions.

As of now I haven't found the best solution, so I am posting my basic solution for the time being and will post a more efficient regular expression once I have one working.

  #!/usr/bin/perl
  use strict;
  use warnings;

  my @files = ("test1.java", "test2.class.txt", "noextension", "many.dots...dots", "old.file_comp.jpeg", "with space.and.dot.mpg", ".dotsonly");
  foreach my $file (@files) {
      # First pattern: the greedy (.*) backtracks to the last dot, so $2 is the extension.
      if ($file =~ m/(.*)\.(.*)/) {
          print "file name is $file and its extension is $2\n";
      }
      # Or you can use the second regular expression, which is a little cleaner than the first.
      my ($ext) = $file =~ /\.([^.]*)$/;
      # Test with defined/length rather than truth, so an extension such as "0" is still printed.
      print "$ext\n" if defined $ext && length $ext;
  }


Explanation:
I didn't use (.*?)\. because it is non-greedy: it would stop at the first occurrence of . and return everything after it in the second group, which is not what we want. (.*)\. is greedy: it first consumes the whole string and then backtracks from the end, so $1 holds everything before the last . (dot).

(.*?) would work if a file name contained only one dot, but to keep the pattern generic for names that may contain many dots I used (.*) instead of (.*?).
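The difference between the two quantifiers can be seen in a minimal sketch like the one below. The helper names split_greedy and split_lazy are only illustrative and are not part of the script above; each one returns the two captured groups, or an empty list when the name has no dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: greedy (.*) runs to the end of the string and
# backtracks, so the split happens at the LAST dot.
sub split_greedy {
    my ($file) = @_;
    return $file =~ /(.*)\.(.*)/ ? ($1, $2) : ();
}

# Hypothetical helper: non-greedy (.*?) takes as little as possible,
# so the split happens at the FIRST dot.
sub split_lazy {
    my ($file) = @_;
    return $file =~ /(.*?)\.(.*)/ ? ($1, $2) : ();
}

my $sample = "test2.class.txt";
my ($g_name, $g_ext) = split_greedy($sample);
my ($l_name, $l_ext) = split_lazy($sample);
print "greedy: name=$g_name ext=$g_ext\n";   # greedy: name=test2.class ext=txt
print "lazy:   name=$l_name ext=$l_ext\n";   # lazy:   name=test2 ext=class.txt
```

With only one dot in the name, both helpers return the same pair, which is why the non-greedy form looks correct until a name with several dots comes along.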

The second regular expression is a bit simpler. It matches a . (i.e. \.) followed only by characters that are not dots, i.e. ([^.]*), up to the end of the file name ($). Whether a name has one dot or several, only its last dot can satisfy this, so the group always contains just the content after the last . (dot).
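The same "everything after the last dot" rule can also be written without a regular expression. The following is only a sketch using Perl's built-in rindex and substr; the helper name ext_by_rindex is hypothetical and does not come from the script above. For the sample names it should give the same extensions as /\.([^.]*)$/.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not from the original post: returns the text after
# the last dot, or undef when the name contains no dot at all.
sub ext_by_rindex {
    my ($file) = @_;
    my $pos = rindex($file, ".");      # index of the last dot, -1 if none
    return undef if $pos < 0;
    return substr($file, $pos + 1);    # empty string for a name like "name."
}

for my $name ("test1.java", "test2.class.txt", "noextension", ".dotsonly") {
    my $ext = ext_by_rindex($name);
    print "$name => ", (defined $ext ? "'$ext'" : "(none)"), "\n";
}
```

Note that, just like the regular expression, this treats a leading dot as a separator, so ".dotsonly" is reported as having the extension "dotsonly". Whether that is the desired behaviour is a design choice left to the caller.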

Note: in file.ext, file is the file name without its extension, ext represents the extension, and . is the separator between them.

If you find or know a better solution, please share it with us.
 


file extension mdi (not verified)
file extension
Superb blogging in this section of the site; thanks to the blogger for posting this thread.
