sparclur.prc package

Analyzer

class sparclur.prc._analyze.Analyzer(files, renderers=[<class 'sparclur.parsers._ghostscript.Ghostscript'>, <class 'sparclur.parsers._mupdf.MuPDF'>, <class 'sparclur.parsers._poppler.Poppler'>, <class 'sparclur.parsers._xpdf.XPDF'>, <class 'sparclur.parsers._pdfium.PDFium'>], metrics='sim', parser_args={}, max_workers=1, timeout=None, overall_timeout=None, recurse=False, base_path=None, progress_bar=True, save_path=None)

Bases: object

Runs pairwise comparisons between the defined renderers over each page of every document in the specified file list or directory.
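The pairing can be illustrated without SPARCLUR installed. This is a minimal sketch, assuming the comparisons are made over unordered pairs of the listed renderers. The string names stand in for the renderer classes, and `renderer_pairs` is a hypothetical helper, not part of the package.

```python
from itertools import combinations

# Placeholder names standing in for the default renderer classes.
RENDERERS = ["Ghostscript", "MuPDF", "Poppler", "XPDF", "PDFium"]


def renderer_pairs(renderers):
    """Return every unordered pair of renderers (hypothetical helper)."""
    return list(combinations(renderers, 2))


pairs = renderer_pairs(RENDERERS)
print(len(pairs))  # 5 renderers give 10 unordered pairs
```

Under that assumption, the number of comparisons per page grows quadratically with the number of renderers, which is worth keeping in mind when choosing max_workers and the timeouts.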

property compare_timeout

Return the configured compare timeout value

property max_workers

Return the configured maximum number of workers

property metrics

List of the metrics to be returned

property overall_timeout

Return the configured overall timeout value

property parser_timeout

Return the configured parser timeout value

property progress_bar

Return the progress bar setting

property renderer_list

List of the renderers to be compared

run()

Return the comparisons for each page of each document from the file list.

Return type
  List[Dict[str, Any]]
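Because the result is a flat list of dictionaries, it is often convenient to regroup it before inspection. The following is a minimal sketch only: the documentation above does not state which keys each dictionary contains, so the "file" and "page" keys, the sample records, and the `group_by_key` helper are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_key(records: List[Dict[str, Any]], key: str) -> Dict[Any, List[Dict[str, Any]]]:
    """Group a flat list of result dictionaries by one key (hypothetical helper)."""
    grouped = defaultdict(list)
    for record in records:
        # Records lacking the key are collected under None.
        grouped[record.get(key)].append(record)
    return dict(grouped)


# Illustrative placeholder records, not real Analyzer output.
# The "file" and "page" keys are assumed for this sketch.
sample = [
    {"file": "a.pdf", "page": 0},
    {"file": "a.pdf", "page": 1},
    {"file": "b.pdf", "page": 0},
]

by_file = group_by_key(sample, "file")
for name, rows in by_file.items():
    print(name, len(rows))
```

In real use, inspect the keys of the first returned dictionary before choosing what to group on.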

property timeout

Return the configured timeout value

PRCViz

class sparclur.prc._viz.PRCViz(doc_path, renderers=[<class 'sparclur.parsers._ghostscript.Ghostscript'>, <class 'sparclur.parsers._mupdf.MuPDF'>, <class 'sparclur.parsers._poppler.Poppler'>, <class 'sparclur.parsers._xpdf.XPDF'>, <class 'sparclur.parsers._pdfium.PDFium'>], parser_args={}, dpi=200, verbose=False)

Bases: object

PDF Render Comparator

Contains methods for comparing and visualizing the renders generated by SPARCLUR wrapped renderers.

display(page, renderers=None, width=10, height=10, save_path=None)

Show the comparison between the specified renderers and the visual difference between them.

Parameters
  • page (int) – The page to visualize

  • renderers (List[tuple] or tuple) – If None, then visualizes all the combinations. Otherwise specify a list of combinations or just one single combination.

  • width (int) – The width of each subplot

  • height (int) – The height of each subplot

  • save_path (str or None) – If None returns the figure for display, otherwise saves the figure to the specified file path.
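The three accepted forms of the renderers argument (None, a single tuple, or a list of tuples) can be reduced to one list of pairs. This is a minimal sketch of that normalization, not the package's actual implementation; `normalize_pairs` and the string renderer names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple, Union

Pair = Tuple[str, str]


def normalize_pairs(
    renderers: Optional[Union[Pair, List[Pair]]],
    available: Sequence[str],
) -> List[Pair]:
    """Turn the renderers argument into a list of pairs (hypothetical helper)."""
    if renderers is None:
        # None means every combination of the available renderers.
        return list(combinations(available, 2))
    if isinstance(renderers, tuple):
        # A single combination is wrapped in a list.
        return [renderers]
    # Otherwise assume a list of combinations was given.
    return list(renderers)


names = ["Ghostscript", "MuPDF", "Poppler"]
print(normalize_pairs(None, names))
print(normalize_pairs(("MuPDF", "Poppler"), names))
```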

get_observed_pages()

Return the number of observed pages from the renderers

plot_sims(cmap='tab10', height=10, width=10, save_path=None)

Plot the structural similarity score for each page and each combination of renderers.

Parameters
  • cmap (str or List[str]) – Either a matplotlib cmap name or a specific list of desired colors.

  • height (int) – The height of the matplotlib figure

  • width (int) – The width of the matplotlib figure

  • save_path (str or None) – If None returns the figure for display, otherwise saves the figure to the specified file path.