A tool for scraping naturalisation decrees from the public LegiFrance registry.
naturalisation par mariage
*.txt files containing extracted text from PDFs
natufrance_2000_2021.tsv for further processing and analysis

    pip install selenium charset_normalizer
    python3 natudump.py

with optional arguments:

-o  specifies output directory
--years  specifies range of years to scrape (default: 2000-2021)
--output-directory-prefix  sets prefix for output directory

    python3 natudump.py -o jo --years $(seq 2000 2021) --output-directory-prefix "$PWD/"
    mkdir -p jo; python3 natudump.py -o jo --years $(seq 2000 2021) --output-directory-prefix "$PWD/"
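
Because $(seq 2000 2021) expands to a space-separated list, --years appears to accept one value per year. The sketch below shows one way such options could be parsed with argparse. It is not taken from natudump.py: the function names, the "jo" default for -o, and the rule that the prefix is simply prepended to the output directory are all assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not natudump's own code)."""
    parser = argparse.ArgumentParser(prog="natudump.py")
    # Assumption: "jo" as the default output directory is illustrative only.
    parser.add_argument("-o", dest="output", default="jo",
                        help="output directory")
    # One integer per year, matching what $(seq 2000 2021) expands to.
    parser.add_argument("--years", nargs="+", type=int,
                        default=list(range(2000, 2022)),
                        help="years to scrape (default: 2000-2021)")
    parser.add_argument("--output-directory-prefix", default="",
                        help="prefix for the output directory")
    return parser


def resolve_output_dir(args: argparse.Namespace) -> Path:
    # Assumption: the prefix is prepended verbatim to the -o value.
    return Path(args.output_directory_prefix + args.output)


if __name__ == "__main__":
    demo = build_parser().parse_args(
        ["-o", "jo", "--years", "2000", "2001",
         "--output-directory-prefix", "/tmp/"]
    )
    print(demo.years, resolve_output_dir(demo))
```

Under these assumptions the trailing slash in "$PWD/" matters, since the prefix and the directory name are joined as plain strings.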
The tool generates the following outputs:
*.txt files containing extracted text from PDFs
natufrance_2000_2021.tsv for further processing and analysis

cat on the jo directory
tar on the jo directory
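
For readers who prefer to stay in Python, the following is a minimal sketch of post-processing the outputs: reading the TSV file, and reproducing the cat and tar steps on the jo directory. The helper names are hypothetical, the files are assumed to be UTF-8 encoded, and nothing is assumed about the TSV columns, so rows come back as plain lists of strings.

```python
import csv
import tarfile
from pathlib import Path


def read_tsv(path):
    """Return the rows of a tab-separated file as lists of strings."""
    # Assumption: the file is UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="\t"))


def concatenate_text_files(directory, destination):
    """Join every *.txt file in `directory` into one file, similar to `cat`."""
    # The file list is built before the destination is opened, so the
    # destination is never read back into itself.
    paths = sorted(Path(directory).glob("*.txt"))
    with open(destination, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(path.read_text(encoding="utf-8"))
    return len(paths)


def archive_directory(directory, archive_path):
    """Pack `directory` into a gzip-compressed tarball, similar to `tar -czf`."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(directory, arcname=Path(directory).name)
```

A typical call would be read_tsv("natufrance_2000_2021.tsv"), followed by concatenate_text_files("jo", "all.txt") and archive_directory("jo", "jo.tar.gz"). The output file names all.txt and jo.tar.gz are placeholders.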