14-第2步_识别文件名中的日期部分

选择背景色：黄橙洋红淡粉水蓝草绿白色选择字体：宋体黑体微软雅黑楷体选择字体大小：小中大特恢复默认

第2步：识别文件名中的日期部分

接下来，程序将循环遍历 os.listdir() 返回的文件名字符串列表，以用这个正则表达式匹配它们。文件名不包含日期的文件将被忽略。如果文件名包含日期，那么匹配的文本将保存在几个变量中。用下面的代码代替程序中的前3个 TODO ：

  #! python3
  # renameDates.py - Renames filenames with American MM-DD-YYYY date format
  # to European DD-MM-YYYY.
  --snip--
  # Loop over the files in the working directory. 
  for amerFilename in os.listdir('.'):
  mo = datePattern.search(amerFilename)
  # Skip files without a date.
❶ if mo == None:
    ❷ continue
❸ # Get the different parts of the filename. 
  beforePart = mo.group(1)
  monthPart = mo.group(2) 
  dayPart = mo.group(4)
  yearPart = mo.group(6)
  afterPart = mo.group(8)
--snip--

如果 search() 方法返回的 Match 对象是 None ❶，那么 amerFilename 中的文件名不匹配该正则表达式。 continue 语句❷将跳过循环剩下的部分，转向下一个文件名。

否则，该正则表达式分组匹配的不同字符串将保存在名为 beforePart 、 monthPart 、 dayPart 、 yearPart 和 afterPart 的变量中❸。这些变量中的字符串将在下一步中使用，用于构成欧洲风格的文件名。

为了让分组编号直观，请尝试从头阅读该正则表达式，每遇到一个左括号就计数加一。不要考虑代码，写下该正则表达式的框架即可。这有助于使分组变得直观，例如：

datePattern = re.compile(r"""^(1) # all text before the date
    (2 (3) )-                      # one or two digits for the month
    (4 (5) )-                      # one or two digits for the day
    (6 (7) )                       # four digits for the year
    (8)$                            # all  text  after  the  date
    """, re.VERBOSE)

这里，编号1至8代表了该正则表达式中的分组。写出该正则表达式的框架，其中只包含括号和分组编号，这会让你更清楚地理解所写的正则表达式。完全理解后再接着看程序中剩下的部分。

Previous 上一篇本节目录 Next 下一篇