{"id":1336,"date":"2015-09-28T00:23:28","date_gmt":"2015-09-28T04:23:28","guid":{"rendered":"http:\/\/agileadam.com\/?p=1336"},"modified":"2021-08-02T17:00:45","modified_gmt":"2021-08-02T21:00:45","slug":"automatic-screenshots","status":"publish","type":"post","link":"https:\/\/agileadam.com\/2015\/09\/automatic-screenshots\/","title":{"rendered":"Automatic Screenshots of Drupal Content"},"content":{"rendered":"
In an earlier post<\/a> I recommended webkit2png<\/em> for automatically screenshotting a list of URLs. A lot of time has passed since that post, and I’ve discovered a more robust tool.\u00a0Pageres<\/a>\u00a0is incredible, and it has a CLI and an API.<\/p>\n I’ll let you discover, on your own, what the\u00a0Pageres<\/em>\u00a0tool can do. I needed to take screenshots of all of the content types on a site, at all of the important resolutions. Here’s a quick Drupal function I threw together to get N random nodes per content type:<\/p>\n The function spits out a list of URLs ready for use with\u00a0pageres<\/em>. Simply save the results to a txt file (urls.txt<\/em> in my example below).<\/p>\n Here’s the pageres command I used to generate the screenshots:<\/p>\n Why the 100-pixel height? Well, the height doesn’t really matter unless you enable cropping. I use 100 on all of them so that it’s obvious the value doesn’t mean anything. I tried 1200×1 but it breaks pageres. 1200×100 works perfectly.<\/p>\n How about another quick function? Here’s one to generate a list of URLs within a menu:<\/p>\n How does the pageres command above handle many URLs? Unfortunately, not that well. Python comes to the rescue in just a few lines of simple code. This will process one URL at a time, generating all resolutions for each URL. I’m certain this could be better (filename should be an argument, for example), but it gets the job done.<\/p>\n UPDATE #1:<\/strong> Here’s a rough draft of a Python script that is a little more robust than the code above. It still lacks some niceties, but I’ll just wait until next time I need it to make improvements.<\/p>\n You would execute this like: python ~\/repos\/pageres_capture\/pageres_capture.py urls.txt<\/span><\/p>\n UPDATE\u00a0#2:\u00a0<\/strong>Here’s a version that adds the URL to the top of the screenshot using ImageMagick. You can turn it off using --no-overlay. As with the code above, this is alpha code. 
Looking at it now, it’s clear I should make “sizes” an argument\/switch. In fact, I should probably allow several of the pageres options.<\/p>\n This requires ImageMagick. Before running, you must be able to run mogrify<\/span>\u00a0successfully\u00a0from the command line.<\/p>\n UPDATE #3:<\/strong> Same as above, but it skips any URL that is not accessible and logs an error when it encounters one. This still only works for Python 2.7:<\/p>\n <\/p>\n","protected":false},"excerpt":{"rendered":" In an earlier post I recommended webkit2png for automatically screenshotting a list of URLs. A lot of time has passed since that post, and I’ve discovered a more robust tool.\u00a0Pageres\u00a0is incredible, and it has a CLI and an API. I’ll let you discover, on your own, what the\u00a0Pageres\u00a0tool can do. I needed to take screenshots of all of the content types on a site, at all of the important resolutions. Here’s a quick Drupal function I threw together to get N random nodes per content 
type:<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[79,76],"tags":[219,83],"_links":{"self":[{"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/posts\/1336"}],"collection":[{"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/comments?post=1336"}],"version-history":[{"count":22,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/posts\/1336\/revisions"}],"predecessor-version":[{"id":2943,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/posts\/1336\/revisions\/2943"}],"wp:attachment":[{"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/media?parent=1336"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/categories?post=1336"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/agileadam.com\/wp-json\/wp\/v2\/tags?post=1336"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}function generate_random_node_urls_by_type($num_per_type = 3, $include_type = FALSE, $alias = FALSE, $node_types = array()) {\r\n $output = '';\r\n if (empty($node_types)) {\r\n foreach (node_type_get_types() as $type) {\r\n $node_types[] = $type->type;\r\n }\r\n }\r\n foreach ($node_types as $node_type) {\r\n $result = db_query_range('SELECT n.nid as nid, ua.alias as alias\r\n FROM {node} n\r\n LEFT JOIN {url_alias} ua ON ua.source = CONCAT(\\'node\/\\', n.nid)\r\n WHERE n.type = :ntype\r\n ORDER BY RAND()', 0, $num_per_type, array(':ntype' => $node_type));\r\n if ($result) {\r\n while ($row = $result->fetchAssoc()) {\r\n if ($include_type) {\r\n $output .= str_pad($node_type, 35);\r\n }\r\n if ($alias && $row['alias']) {\r\n $output 
.= $GLOBALS['base_url']. '\/' . $row['alias'] . \"\\n\";\r\n }\r\n else {\r\n $output .= $GLOBALS['base_url'] . '\/node\/' . $row['nid'] . \"\\n\";\r\n }\r\n }\r\n }\r\n }\r\n \r\n return $output;\r\n}\r\n\r\n\/\/ Example 1: 10 of each specific node type:\r\ndpm(generate_random_node_urls_by_type(10, FALSE, TRUE, array('homepage_feature', 'page')));\r\n\r\n\/\/ Example 2: 5 of every node type:\r\ndpm(generate_random_node_urls_by_type(5, FALSE, TRUE));<\/pre>\n
pageres --delay 1 --header='Cache-Control: no-cache' --filename=\"<%= date %> - <%= url %> - <%= size %>\" 1200x100 1024x100 768x100 520x100 320x100 < urls.txt<\/pre>\n
\/\/ Returns a newline-separated list of absolute URLs for the aliased links in a menu.\r\nfunction generate_node_urls_in_menu($menu_name, $alias = FALSE) {\r\n  $output = '';\r\n  \/\/ The INNER JOIN limits the results to menu links that have a URL alias.\r\n  $result = db_query('SELECT m.link_path as link_path, ua.alias as alias\r\n    FROM {menu_links} m\r\n    INNER JOIN {url_alias} ua ON ua.source = m.link_path\r\n    WHERE menu_name = :mname', array(':mname' => $menu_name));\r\n  if ($result) {\r\n    while ($row = $result->fetchAssoc()) {\r\n      if ($alias) {\r\n        $output .= $GLOBALS['base_url'] . '\/' . $row['alias'] . \"\\n\";\r\n      }\r\n      else {\r\n        $output .= $GLOBALS['base_url'] . '\/' . $row['link_path'] . \"\\n\";\r\n      }\r\n    }\r\n  }\r\n\r\n  return $output;\r\n}\r\n\r\ndpm(generate_node_urls_in_menu('menu-for-undergraduates', TRUE));<\/pre>\n
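import subprocess\r\n\r\n# Process one URL at a time, generating all resolutions for each URL.\r\nwith open(\"urls.txt\", \"r\") as file:\r\n    for line in file:\r\n        # Skip blank lines so pageres is never started without a URL.\r\n        if not line.strip():\r\n            continue\r\n        print \"Generating screenshots for\", line.strip()\r\n        p = subprocess.Popen(\"pageres --header='Cache-Control: no-cache' --filename='<%= date %> - <%= url %> - <%= size %>' 1200x100 1024x100 768x100 520x100 320x100\",\r\n                             shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n        # communicate() writes the URL to stdin, closes stdin, and waits for pageres to exit.\r\n        p.communicate(line)<\/pre>\n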
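The post mentions that the filename should be an argument. Here is a minimal Python 3 sketch of that idea, not the author's code. It assumes pageres accepts the URL as a positional argument (the Update #3 script calls it that way), and the helper names `read_urls` and `build_command` are hypothetical. Passing the command as a list avoids the shell quoting around the header and filename template.

```python
import subprocess
import sys

# Same resolutions as the command above; the height is ignored without cropping.
SIZES = ["1200x100", "1024x100", "768x100", "520x100", "320x100"]


def read_urls(lines):
    """Return the non-blank lines, stripped of surrounding whitespace."""
    return [line.strip() for line in lines if line.strip()]


def build_command(url, sizes=SIZES):
    """Build the pageres argument list for one URL (hypothetical helper)."""
    return [
        "pageres",
        url,
        "--header=Cache-Control: no-cache",
        "--filename=<%= date %> - <%= url %> - <%= size %>",
    ] + list(sizes)


def main(path):
    """Capture every URL listed in the file at `path`, one URL at a time."""
    with open(path) as handle:
        for url in read_urls(handle):
            print("Generating screenshots for", url)
            subprocess.call(build_command(url))


# Only run when a URL file is given, e.g.: python capture.py urls.txt
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

The script name `capture.py` is only a placeholder. Splitting the command construction from the loop makes it possible to check the arguments without pageres installed.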
#!\/usr\/bin\/env python\r\n\r\nimport argparse\r\nimport subprocess\r\nimport logging\r\nimport sys\r\n\r\n# Example:\r\n# sizes = \"1200x100 1024x100 768x100 520x100 320x100\"\r\nsizes = \"1200x100\"\r\n\r\nLOG = logging.getLogger(__name__)\r\nLOG.setLevel(logging.DEBUG)\r\nformatter = logging.Formatter(\"%(asctime)s [%(levelname)s] %(message)s\", \"%Y-%m-%d %H:%M:%S\")\r\n\r\n# Console logging\r\nch = logging.StreamHandler(sys.stdout)\r\nch.setLevel(logging.INFO)\r\nch.setFormatter(formatter)\r\nLOG.addHandler(ch)\r\n\r\nparser = argparse.ArgumentParser(description='Captures screenshots of URLs from a file using Pageres', version='1.0', add_help=True)\r\nparser.add_argument('inputfile', action=\"store\", type=file)\r\nargs = parser.parse_args()\r\n\r\n# Loop through all of the lines in the input file and process them\r\nlines = args.inputfile.read().splitlines()\r\n\r\ni = 0\r\nfor line in lines:\r\n    # Increase the line number by one for our user messages\r\n    i += 1\r\n\r\n    # Clean the line\r\n    lineclean = line.strip()\r\n\r\n    if lineclean == '':\r\n        LOG.info('Line %d - Ignoring blank line' % i)\r\n        continue\r\n\r\n    LOG.info('Line %d - Capturing %s' % (i, lineclean))\r\n    p = subprocess.Popen(\"pageres --header='Cache-Control: no-cache' --filename='<%= date %> - <%= url %> - <%= size %>' \" + sizes,\r\n                         shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n    # communicate() writes the URL to stdin, closes stdin, and waits for pageres to exit.\r\n    p.communicate(lineclean)<\/pre>\n
#!\/usr\/bin\/env python\r\n\r\nimport argparse\r\nimport subprocess\r\nimport logging\r\nimport sys\r\n\r\n# Example:\r\n# sizes = \"1200x100 1024x100 768x100 520x100 320x100\"\r\nsizes = \"1200x100\"\r\n\r\nLOG = logging.getLogger(__name__)\r\nLOG.setLevel(logging.DEBUG)\r\nformatter = logging.Formatter(\"%(asctime)s [%(levelname)s] %(message)s\", \"%Y-%m-%d %H:%M:%S\")\r\n\r\n# Console logging\r\nch = logging.StreamHandler(sys.stdout)\r\nch.setLevel(logging.INFO)\r\nch.setFormatter(formatter)\r\nLOG.addHandler(ch)\r\n\r\nparser = argparse.ArgumentParser(description='Captures screenshots of URLs from a file using Pageres', version='1.0', add_help=True)\r\nparser.add_argument('inputfile', action='store', type=file)\r\nparser.add_argument('--no-overlay', help='Do not add URL overlay', action='store_true')\r\nargs = parser.parse_args()\r\n\r\n# Loop through all of the lines in the input file and process them\r\nlines = args.inputfile.read().splitlines()\r\n\r\ni = 0\r\nfor line in lines:\r\n    # Increase the line number by one for our user messages\r\n    i += 1\r\n\r\n    # Clean the line\r\n    lineclean = line.strip()\r\n\r\n    if lineclean == '':\r\n        LOG.info('Line %d - Ignoring blank line' % i)\r\n        continue\r\n\r\n    LOG.info('Line %d - Capturing %s' % (i, lineclean))\r\n    p = subprocess.Popen(\"pageres --header='Cache-Control: no-cache' --filename='<%= date %> - <%= url %> - <%= size %>' \" + sizes,\r\n                         shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n    # communicate() writes the URL to stdin, closes stdin, and waits for pageres to exit.\r\n    p.communicate(lineclean)\r\n\r\n    if not args.no_overlay:\r\n        # Annotate the newest file in the current directory (the screenshot just taken) with the URL.\r\n        p = subprocess.Popen('OUTPUT=\"$(ls -Art | tail -n 1)\"; mogrify -pointsize 14 -background Gold -gravity North -splice 0x18 -annotate +0+2 \\'%s\\' \"${OUTPUT}\"' % lineclean,\r\n                             shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n        p.communicate()<\/pre>\n
#!\/usr\/bin\/env python\r\n\r\nimport argparse\r\nimport subprocess\r\nimport logging\r\nimport sys\r\nfrom urllib import urlopen\r\n\r\n# Example:\r\n# sizes = \"1200x100 1024x100 768x100 520x100 320x100\"\r\nsizes = \"1200x1200\"\r\n\r\n# CLI arguments from https:\/\/www.npmjs.com\/package\/pageres-cli\r\n# Example:\r\n# options = \"--header='Cache-Control: no-cache' --filename='<%= date %> - <%= url %> - <%= size %>'\"\r\noptions = \"--format=png --header='Cache-Control: no-cache' --filename='<%= date %> - <%= url %> - <%= size %>'\"\r\n\r\nLOG = logging.getLogger(__name__)\r\nLOG.setLevel(logging.DEBUG)\r\nformatter = logging.Formatter(\"%(asctime)s [%(levelname)s] %(message)s\", \"%Y-%m-%d %H:%M:%S\")\r\n\r\n# Console logging\r\nch = logging.StreamHandler(sys.stdout)\r\nch.setLevel(logging.INFO)\r\nch.setFormatter(formatter)\r\nLOG.addHandler(ch)\r\n\r\nparser = argparse.ArgumentParser(description='Captures screenshots of URLs from a file using Pageres', version='1.0', add_help=True)\r\nparser.add_argument('inputfile', action='store', type=file)\r\nparser.add_argument('--no-overlay', help='Do not add URL overlay', action='store_true')\r\nargs = parser.parse_args()\r\n\r\n# Loop through all of the lines in the input file and process them\r\nlines = args.inputfile.read().splitlines()\r\n\r\ni = 0\r\nfor line in lines:\r\n    # Increase the line number by one for our user messages\r\n    i += 1\r\n\r\n    lineclean = line.strip()\r\n    if lineclean == '':\r\n        LOG.info('Line %d - Ignoring blank line' % i)\r\n        continue\r\n\r\n    # Skip URLs that cannot be opened, logging an error for each one\r\n    try:\r\n        urlopen(lineclean).getcode()\r\n    except Exception:\r\n        LOG.error('Line %d - Error capturing %s' % (i, lineclean))\r\n        continue\r\n\r\n    LOG.info('Line %d - Capturing %s' % (i, lineclean))\r\n    p = subprocess.Popen(\"pageres \\\"\" + lineclean + \"\\\" \" + options + \" \" + sizes,\r\n                         shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n    # communicate() closes stdin and waits for pageres to exit.\r\n    p.communicate()\r\n\r\n    if not args.no_overlay:\r\n        # Annotate the newest file in the current directory (the screenshot just taken) with the URL.\r\n        p = subprocess.Popen('OUTPUT=\"$(ls -Art | tail -n 1)\"; mogrify -pointsize 14 -background Gold -gravity North -splice 0x18 -annotate +0+2 \\'%s\\' \"${OUTPUT}\"' % lineclean,\r\n                             shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)\r\n        p.communicate()\r\n<\/pre>\n
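The post says “sizes” should become an argument and that the script only works on Python 2.7. Here is a rough sketch of how those two pieces might look on Python 3, where the `file` builtin and `urllib.urlopen` no longer exist. It is not the author's code: the `--sizes` option, its comma-separated format, and the helper names `parse_sizes`, `parse_args`, and `is_reachable` are all assumptions made for illustration. The capture and overlay steps would stay as in the script above.

```python
import argparse
import urllib.error
import urllib.request


def parse_sizes(value):
    """Turn '1200x100,320x100' into a list, rejecting anything not WIDTHxHEIGHT."""
    sizes = [size for size in value.split(",") if size]
    if not sizes:
        raise argparse.ArgumentTypeError("at least one size is required")
    for size in sizes:
        width, sep, height = size.partition("x")
        if not (sep and width.isdigit() and height.isdigit()):
            raise argparse.ArgumentTypeError("invalid size: %r" % size)
    return sizes


def parse_args(argv=None):
    """Parse the command line; --sizes is a hypothetical new option."""
    parser = argparse.ArgumentParser(
        description="Captures screenshots of URLs from a file using Pageres")
    parser.add_argument("inputfile", help="text file with one URL per line")
    parser.add_argument("--no-overlay", action="store_true",
                        help="Do not add URL overlay")
    parser.add_argument("--sizes", type=parse_sizes, default="1200x1200",
                        help="comma-separated list of WIDTHxHEIGHT sizes")
    return parser.parse_args(argv)


def is_reachable(url, timeout=10):
    """Return True if the URL can be opened, False on any connection or URL error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, ValueError, OSError):
        return False
```

With this in place, a call might look like `python pageres_capture.py urls.txt --sizes 1200x100,320x100`. A comma-separated value is used here so the option cannot swallow the input filename, which a space-separated list could do.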