Regex for finding file paths
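Use the regex
(\/.*?\.[\w:]+)
to make the match non-greedy, so each match ends at the first file extension it reaches rather than running on to the last one in the line. If you want to find multiple matches in the same line, you can use re.findall().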
Update:
Using this code and the example provided, I get:
import re
re.findall(r'(\/.*?\.[\w:]+)', "file path /log/file.txt some lines /log/var/file2.txt")
['/log/file.txt', '/log/var/file2.txt']
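To see what the non-greedy quantifier changes, here is a small sketch that runs the lazy pattern next to a greedy variant on the same example line. The variable names are only for illustration.

```python
import re

line = "file path /log/file.txt some lines /log/var/file2.txt"

# Lazy: .*? stops at the first ".ext" it can reach, so each path is matched separately.
lazy = re.findall(r'\/.*?\.[\w:]+', line)

# Greedy: .* runs on to the last ".ext" in the line and swallows the text in between.
greedy = re.findall(r'\/.*\.[\w:]+', line)

print(lazy)    # ['/log/file.txt', '/log/var/file2.txt']
print(greedy)  # ['/log/file.txt some lines /log/var/file2.txt']
```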
Regex for parsing directory and filename
Try this:
^(.+)\/([^\/]+)$
EDIT: escaped the forward slash to prevent problems when copy/pasting the Regex
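As a rough illustration, the pattern can be tried in Python like this. The helper name split_path and the sample paths are made up for the example.

```python
import re

# Same pattern as above; Python does not need the slash escaped, but it is harmless.
SPLIT_PATH = re.compile(r'^(.+)\/([^\/]+)$')

def split_path(path):
    """Return (directory, filename), or None if the pattern does not match."""
    m = SPLIT_PATH.match(path)
    return m.groups() if m else None

print(split_path("/log/var/file2.txt"))  # ('/log/var', 'file2.txt')
print(split_path("file2.txt"))           # None: there is no slash at all
print(split_path("/file2.txt"))          # None: (.+) needs a character before the last slash
```

Note that a file directly under the root does not match, because the directory group requires at least one character.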
RegEx to find Windows file paths inside of text
You may use this regex to capture folder and filename in 2 separate capture groups:
(?:\\\\[^\\]+|[a-zA-Z]:)((?:\\[^\\]+)+\\)?([^<>:]*)
RegEx Details:
(?:\\\\[^\\]+|[a-zA-Z]:) : Match either a server name or IP address that starts with \\ followed by 1+ non-\ characters, OR a drive letter followed by a :, in a non-capturing group.
((?:\\[^\\]+)+\\)? : 1st capture group for the folder path. It matches a string that starts with a \ followed by 1+ non-\ characters, allows multiple occurrences of that, and ends with a \. This group is optional due to the presence of ? at the end.
([^<>:]*) : Match the filename, which is 0 or more of any character that is not <, > or :
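A minimal sketch of the two capture groups in Python, assuming the input string is just the path itself. The function name and the sample paths are hypothetical.

```python
import re

# The pattern from above, written as a Python raw string.
WIN_PATH = re.compile(r'(?:\\\\[^\\]+|[a-zA-Z]:)((?:\\[^\\]+)+\\)?([^<>:]*)')

def split_windows_path(text):
    """Return (folder, filename) for the first match, or None if nothing matches."""
    m = WIN_PATH.search(text)
    return (m.group(1), m.group(2)) if m else None

# Drive-letter path: folder and filename land in groups 1 and 2.
print(split_windows_path(r"C:\Users\me\docs\report.txt"))
# ('\\Users\\me\\docs\\', 'report.txt')

# UNC path: the server name is matched but not captured.
print(split_windows_path(r"\\server\share\folder\file.txt"))
# ('\\share\\folder\\', 'file.txt')

# No folder part: group 1 is optional, so it comes back as None.
print(split_windows_path(r"C:report.txt"))
# (None, 'report.txt')
```

Because the filename group accepts spaces, a path embedded in a sentence will also pull in the words after it, up to the next <, > or : or the end of the string.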
Regular expression to match a file path with certain prefix
You don't need groups or lookaheads; in MongoDB this is just a regex match. The query could be as simple as:
db.collection.find({fieldname:/^\/abcd\/[^\/]+$/})
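To check which values the pattern accepts without a database, the same expression can be tried with Python's re module. This is only a stand-in for MongoDB's regex engine; for a simple anchored pattern like this the two should agree, but that is an assumption. The sample values are made up.

```python
import re

# Same anchored pattern as in the query: the /abcd/ prefix plus exactly one more segment.
PREFIX = re.compile(r'^\/abcd\/[^\/]+$')

samples = ["/abcd/file.txt", "/abcd/sub/file.txt", "/abcd/", "/xyz/file.txt"]
matches = [s for s in samples if PREFIX.search(s)]

print(matches)  # ['/abcd/file.txt']
```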
Regex to find directory in text
You can use a regex like this:
\/.*\.[\w:]+
Btw, if you want to allow backslashes in the path you can have:
[\\\/].*\.[\w:]+
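Here is a quick sketch of the backslash-tolerant pattern in Python. The sample strings are hypothetical.

```python
import re

# Accepts either a forward slash or a backslash as the first character of the match.
EITHER_SLASH = re.compile(r'[\\\/].*\.[\w:]+')

unix_hits = EITHER_SLASH.findall("see /var/log/app.log")
windows_hits = EITHER_SLASH.findall(r"see C:\temp\out.log")

print(unix_hits)     # ['/var/log/app.log']
print(windows_hits)  # ['\\temp\\out.log']
```

The match starts at the first slash, so a drive letter such as C: is not part of the result. The .* here is greedy, so a line with several paths comes back as one long match.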
Regex for extracting part of a file path
If we wish to capture the / here, we might just try ([\/]+). There are other expressions that extract "one" as well, such as:
(?:\/[a-z]+\/)(.+?)(?:\/.+)
and our code might look like this (the group index is 1, since non-capturing groups are not numbered):
regexp_extract(filepath, '(?:\/[a-z]+\/)(.+?)(?:\/.+)', 1)
or
regexp_extract(filepath, '(?:\/.+?\/)(.+?)(?:\/.+)', 1)
Compartments
In this case, we skip what comes before "one" using a non-capturing group:
(?:\/[a-z]+\/)
then we capture "one" using:
(.+?)
and finally we add a right boundary after "one" in another non-capturing group:
(?:\/.+)
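The three parts can be tried together in Python as a rough sketch. The sample path /root/one/two/file.txt and the helper name are assumptions for illustration, not taken from the original question.

```python
import re

# Non-capturing left part, one capturing group, non-capturing right boundary.
SEGMENT = re.compile(r'(?:\/[a-z]+\/)(.+?)(?:\/.+)')

def second_segment(filepath):
    """Return the segment captured by group 1, or None if there is no match."""
    m = SEGMENT.search(filepath)
    return m.group(1) if m else None

print(second_segment("/root/one/two/file.txt"))  # one
print(second_segment("/root/one"))               # None: nothing follows for the right boundary
print(SEGMENT.groups)                            # 1: only one group is numbered
```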
Depending on which slash "one" comes after, we can modify our expression. For example, in this case, this expression might also work:
(?:\/.+?\/)(.+?)(?:\/.+)
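The two expressions differ when the first segment is not all lowercase letters. A small sketch with a hypothetical path shows the effect:

```python
import re

lowercase_only = re.compile(r'(?:\/[a-z]+\/)(.+?)(?:\/.+)')
any_first = re.compile(r'(?:\/.+?\/)(.+?)(?:\/.+)')

# Hypothetical path whose first segment contains an uppercase letter and a digit.
path = "/Root1/one/two/file.txt"

# [a-z]+ cannot match "Root1", so the search slides right and treats "/one/" as the prefix.
first = lowercase_only.search(path).group(1)

# .+? accepts any first segment, so the capture stays on the second one.
second = any_first.search(path).group(1)

print(first)   # two
print(second)  # one
```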