AnsiballZ improvements

Now that we don't need to worry about python-2.4 and 2.5, we can make
some improvements to the way AnsiballZ handles modules.

* Change AnsiballZ wrapper to use import to invoke the module
  We need the module to think of itself as a script because it could be
  coded as:

      main()

  or as:

      if __name__ == '__main__':
          main()

  Or even as:

      if __name__ == '__main__':
          random_function_name()

  A script will invoke all of those.  Prior to this change, we invoked
  a second Python interpreter on the module so that it really was
  a script.  However, this means that we have to run python twice (once
  for the AnsiballZ wrapper and once for the module).  This change makes
  the module think that it is a script (because __name__ in the module ==
  '__main__') but it's actually being invoked by us importing the module
  code.

  There are three ways we've come up with to do this.
  * The most elegant is to use zipimporter and tell the import mechanism
    that the module being loaded is __main__:
    * 5959f11c9d/lib/ansible/executor/module_common.py (L175)
    * zipimporter is nice because we do not have to extract the module from
      the zip file and save it to the disk when we do that.  The import
      machinery does it all for us.
    * The drawback is that modules do not have a __file__ which points
      to a real file when they do this.  Modules could be using __file__
      for a variety of reasons, most of which probably have
      replacements (the most common one is to find a writable directory
      for temporary files; AnsibleModule.tmpdir should be used instead).
      We can monkeypatch __file__ in from AnsibleModule initialization
      but that's kind of gross.  There's no way I can see to do this
      from the wrapper.

  * Next, there's imp.load_module():
    * 340edf7489/lib/ansible/executor/module_common.py (L151)
    * imp has the nice property of allowing us to set __name__ to
      __main__ without changing the name of the file itself
    * We also don't have to do anything special to set __file__ for
      backwards compatibility (although the reason for that is the
      drawback):
    * Its drawback is that it requires the file to exist on disk so we
      have to explicitly extract it from the zipfile and save it to
      a temporary file

  * The last choice is to use exec to execute the module:
    * f47a4ccc76/lib/ansible/executor/module_common.py (L175)
    * The code we would have to maintain for this looks pretty clean.
      In the wrapper we create a ModuleType, set __file__ on it, read
      the module's contents in from the zip file and then exec it.
    * Drawbacks: We still have to explicitly extract the file's contents
      from the zip archive instead of letting python's import mechanism
      handle it.
    * Exec also has hidden performance issues and breaks certain
      assumptions that modules could be making about their own code:
      http://lucumr.pocoo.org/2011/2/1/exec-in-python/

  Our plan is to use imp.load_module() for now, deprecate the use of
  __file__ in modules, and switch to zipimport once the deprecation
  period for __file__ is over (without monkeypatching a fake __file__ in
  via AnsibleModule).

* Rename the AnsiballZ wrapped module
  This makes it obvious that the wrapped module isn't the module file that
  we distribute.  It's part of trying to mitigate the fact that the module
  is now named __main__.py in tracebacks.

* Shield all wrapper symbols inside of a function
  With the new import code, all symbols in the wrapper become visible in
  the module.  To mitigate the chance of collisions, move most symbols
  into a toplevel function.  The only symbols left in the global namespace
  are now _ANSIBALLZ_WRAPPER and _ansiballz_main.
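
As a rough illustration of the exec option from the first item above (not
the code this commit ships, which uses imp.load_module()), the sketch below
builds an in-memory zip containing a __main__.py, reads it back, and execs
it in a ModuleType named __main__ with __file__ set.  MODULE_SOURCE,
build_payload(), run_as_main() and the fake path are hypothetical names
made up for this example.

```python
import io
import types
import zipfile

# Hypothetical module source standing in for a real Ansible module.
MODULE_SOURCE = '''
RESULT = []

def main():
    RESULT.append(__name__)

if __name__ == '__main__':
    main()
'''


def build_payload():
    """Return an in-memory zip holding the module as __main__.py."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr('__main__.py', MODULE_SOURCE)
    buf.seek(0)
    return buf


def run_as_main(payload, fake_path='/hypothetical/payload.zip/__main__.py'):
    """Exec __main__.py from the zip inside a module object named __main__."""
    # Explicitly pull the source out of the archive (the drawback noted above).
    with zipfile.ZipFile(payload) as zf:
        source = zf.read('__main__.py')
    # ModuleType('__main__') sets __name__, so the usual guard fires.
    mod = types.ModuleType('__main__')
    mod.__file__ = fake_path  # assumption: any path-like string works here
    code = compile(source, fake_path, 'exec')
    exec(code, mod.__dict__)
    return mod


mod = run_as_main(build_payload())
print(mod.RESULT)  # prints ['__main__']
```

The guard runs main() because the module object was created with the name
__main__.  This sketch leaves sys.modules alone, which a real wrapper would
probably have to deal with.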

revised porting guide entry

Integrate code coverage collection into AnsiballZ.

ci_coverage
ci_complete
Toshio Kuratomi 2018-06-20 11:23:59 -07:00
parent ec20d4b13e
commit 52449cc01a
32 changed files with 349 additions and 321 deletions


@ -70,7 +70,7 @@ _MODULE_UTILS_PATH = os.path.join(os.path.dirname(__file__), '..', 'module_utils
ANSIBALLZ_TEMPLATE = u'''%(shebang)s
%(coding)s
ANSIBALLZ_WRAPPER = True # For test-module script to tell this is a ANSIBALLZ_WRAPPER
_ANSIBALLZ_WRAPPER = True # For test-module script to tell this is a ANSIBALLZ_WRAPPER
# This code is part of Ansible, but is an independent component.
# The code in this particular templatable string, and this templatable string
# only, is BSD licensed. Modules which end up using this snippet, which is
@ -98,201 +98,193 @@ ANSIBALLZ_WRAPPER = True # For test-module script to tell this is a ANSIBALLZ_WR
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
# USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os
import os.path
import sys
import __main__
def _ansiballz_main():
import os
import os.path
import sys
import __main__
# For some distros and python versions we pick up this script in the temporary
# directory. This leads to problems when the ansible module masks a python
# library that another import needs. We have not figured out what about the
# specific distros and python versions causes this to behave differently.
#
# Tested distros:
# Fedora23 with python3.4 Works
# Ubuntu15.10 with python2.7 Works
# Ubuntu15.10 with python3.4 Fails without this
# Ubuntu16.04.1 with python3.5 Fails without this
# To test on another platform:
# * use the copy module (since this shadows the stdlib copy module)
# * Turn off pipelining
# * Make sure that the destination file does not exist
# * ansible ubuntu16-test -m copy -a 'src=/etc/motd dest=/var/tmp/m'
# This will traceback in shutil. Looking at the complete traceback will show
# that shutil is importing copy which finds the ansible module instead of the
# stdlib module
scriptdir = None
try:
scriptdir = os.path.dirname(os.path.realpath(__main__.__file__))
except (AttributeError, OSError):
# Some platforms don't set __file__ when reading from stdin
# OSX raises OSError if using abspath() in a directory we don't have
# permission to read (realpath calls abspath)
pass
if scriptdir is not None:
sys.path = [p for p in sys.path if p != scriptdir]
# For some distros and python versions we pick up this script in the temporary
# directory. This leads to problems when the ansible module masks a python
# library that another import needs. We have not figured out what about the
# specific distros and python versions causes this to behave differently.
#
# Tested distros:
# Fedora23 with python3.4 Works
# Ubuntu15.10 with python2.7 Works
# Ubuntu15.10 with python3.4 Fails without this
# Ubuntu16.04.1 with python3.5 Fails without this
# To test on another platform:
# * use the copy module (since this shadows the stdlib copy module)
# * Turn off pipelining
# * Make sure that the destination file does not exist
# * ansible ubuntu16-test -m copy -a 'src=/etc/motd dest=/var/tmp/m'
# This will traceback in shutil. Looking at the complete traceback will show
# that shutil is importing copy which finds the ansible module instead of the
# stdlib module
scriptdir = None
try:
scriptdir = os.path.dirname(os.path.realpath(__main__.__file__))
except (AttributeError, OSError):
# Some platforms don't set __file__ when reading from stdin
# OSX raises OSError if using abspath() in a directory we don't have
# permission to read (realpath calls abspath)
pass
if scriptdir is not None:
sys.path = [p for p in sys.path if p != scriptdir]
import base64
import shutil
import zipfile
import tempfile
import subprocess
import base64
import shutil
import tempfile
import zipimport
import zipfile
if sys.version_info < (3,):
bytes = str
PY3 = False
else:
unicode = str
PY3 = True
ZIPDATA = """%(zipdata)s"""
def invoke_module(module, modlib_path, json_params):
pythonpath = os.environ.get('PYTHONPATH')
if pythonpath:
os.environ['PYTHONPATH'] = ':'.join((modlib_path, pythonpath))
if sys.version_info < (3,):
bytes = str
PY3 = False
else:
os.environ['PYTHONPATH'] = modlib_path
unicode = str
PY3 = True
p = subprocess.Popen([%(interpreter)s, module], env=os.environ, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
(stdout, stderr) = p.communicate(json_params)
ZIPDATA = """%(zipdata)s"""
if not isinstance(stderr, (bytes, unicode)):
stderr = stderr.read()
if not isinstance(stdout, (bytes, unicode)):
stdout = stdout.read()
if PY3:
sys.stderr.buffer.write(stderr)
sys.stdout.buffer.write(stdout)
else:
sys.stderr.write(stderr)
sys.stdout.write(stdout)
return p.returncode
def invoke_module(modlib_path, json_params):
# When installed via setuptools (including python setup.py install),
# ansible may be installed with an easy-install.pth file. That file
# may load the system-wide install of ansible rather than the one in
# the module. sitecustomize is the only way to override that setting.
z = zipfile.ZipFile(modlib_path, mode='a')
def debug(command, zipped_mod, json_params):
# The code here normally doesn't run. It's only used for debugging on the
# remote machine.
#
# The subcommands in this function make it easier to debug ansiballz
# modules. Here's the basic steps:
#
# Run ansible with the environment variable: ANSIBLE_KEEP_REMOTE_FILES=1 and -vvv
# to save the module file remotely::
# $ ANSIBLE_KEEP_REMOTE_FILES=1 ansible host1 -m ping -a 'data=october' -vvv
#
# Part of the verbose output will tell you where on the remote machine the
# module was written to::
# [...]
# <host1> SSH: EXEC ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o
# PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o
# ControlPath=/home/badger/.ansible/cp/ansible-ssh-%%h-%%p-%%r -tt rhel7 '/bin/sh -c '"'"'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
# LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping'"'"''
# [...]
#
# Login to the remote machine and run the module file via from the previous
# step with the explode subcommand to extract the module payload into
# source files::
# $ ssh host1
# $ /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping explode
# Module expanded into:
# /home/badger/.ansible/tmp/ansible-tmp-1461173408.08-279692652635227/ansible
#
# You can now edit the source files to instrument the code or experiment with
# different parameter values. When you're ready to run the code you've modified
# (instead of the code from the actual zipped module), use the execute subcommand like this::
# $ /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping execute
# py3: modlib_path will be text, py2: it's bytes. Need bytes at the end
sitecustomize = u'import sys\\nsys.path.insert(0,"%%s")\\n' %% modlib_path
sitecustomize = sitecustomize.encode('utf-8')
# Use a ZipInfo to work around zipfile limitation on hosts with
# clocks set to a pre-1980 year (for instance, Raspberry Pi)
zinfo = zipfile.ZipInfo()
zinfo.filename = 'sitecustomize.py'
zinfo.date_time = ( %(year)i, %(month)i, %(day)i, %(hour)i, %(minute)i, %(second)i)
z.writestr(zinfo, sitecustomize)
z.close()
# Okay to use __file__ here because we're running from a kept file
basedir = os.path.join(os.path.abspath(os.path.dirname(__file__)), 'debug_dir')
args_path = os.path.join(basedir, 'args')
script_path = os.path.join(basedir, 'ansible_module_%(ansible_module)s.py')
# Put the zipped up module_utils we got from the controller first in the python path so that we
# can monkeypatch the right basic
sys.path.insert(0, modlib_path)
if command == 'explode':
# transform the ZIPDATA into an exploded directory of code and then
# print the path to the code. This is an easy way for people to look
# at the code on the remote machine for debugging it in that
# environment
z = zipfile.ZipFile(zipped_mod)
for filename in z.namelist():
if filename.startswith('/'):
raise Exception('Something wrong with this module zip file: should not contain absolute paths')
# Monkeypatch the parameters into basic
from ansible.module_utils import basic
basic._ANSIBLE_ARGS = json_params
%(coverage)s
# Run the module! By importing it as '__main__', it thinks it is executing as a script
importer = zipimport.zipimporter(modlib_path)
importer.load_module('__main__')
dest_filename = os.path.join(basedir, filename)
if dest_filename.endswith(os.path.sep) and not os.path.exists(dest_filename):
os.makedirs(dest_filename)
else:
directory = os.path.dirname(dest_filename)
if not os.path.exists(directory):
os.makedirs(directory)
f = open(dest_filename, 'wb')
f.write(z.read(filename))
f.close()
# write the args file
f = open(args_path, 'wb')
f.write(json_params)
f.close()
print('Module expanded into:')
print('%%s' %% basedir)
exitcode = 0
elif command == 'execute':
# Execute the exploded code instead of executing the module from the
# embedded ZIPDATA. This allows people to easily run their modified
# code on the remote machine to see how changes will affect it.
# This differs slightly from default Ansible execution of Python modules
# as it passes the arguments to the module via a file instead of stdin.
# Set pythonpath to the debug dir
pythonpath = os.environ.get('PYTHONPATH')
if pythonpath:
os.environ['PYTHONPATH'] = ':'.join((basedir, pythonpath))
else:
os.environ['PYTHONPATH'] = basedir
p = subprocess.Popen([%(interpreter)s, script_path, args_path],
env=os.environ, shell=False, stdout=subprocess.PIPE,
stderr=subprocess.PIPE, stdin=subprocess.PIPE)
(stdout, stderr) = p.communicate()
if not isinstance(stderr, (bytes, unicode)):
stderr = stderr.read()
if not isinstance(stdout, (bytes, unicode)):
stdout = stdout.read()
if PY3:
sys.stderr.buffer.write(stderr)
sys.stdout.buffer.write(stdout)
else:
sys.stderr.write(stderr)
sys.stdout.write(stdout)
return p.returncode
elif command == 'excommunicate':
# This attempts to run the module in-process (by importing a main
# function and then calling it). It is not the way ansible generally
# invokes the module so it won't work in every case. It is here to
# aid certain debuggers which work better when the code doesn't change
# from one process to another but there may be problems that occur
# when using this that are only artifacts of how we're invoking here,
# not actual bugs (as they don't affect the real way that we invoke
# ansible modules)
# stub the args and python path
sys.argv = ['%(ansible_module)s', args_path]
sys.path.insert(0, basedir)
from ansible_module_%(ansible_module)s import main
main()
print('WARNING: Module returned to wrapper instead of exiting')
# Ansible modules must exit themselves
print('{"msg": "New-style module did not handle its own exit", "failed": true}')
sys.exit(1)
else:
print('WARNING: Unknown debug command. Doing nothing.')
exitcode = 0
return exitcode
def debug(command, zipped_mod, json_params):
# The code here normally doesn't run. It's only used for debugging on the
# remote machine.
#
# The subcommands in this function make it easier to debug ansiballz
# modules. Here's the basic steps:
#
# Run ansible with the environment variable: ANSIBLE_KEEP_REMOTE_FILES=1 and -vvv
# to save the module file remotely::
# $ ANSIBLE_KEEP_REMOTE_FILES=1 ansible host1 -m ping -a 'data=october' -vvv
#
# Part of the verbose output will tell you where on the remote machine the
# module was written to::
# [...]
# <host1> SSH: EXEC ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o
# PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o
# ControlPath=/home/badger/.ansible/cp/ansible-ssh-%%h-%%p-%%r -tt rhel7 '/bin/sh -c '"'"'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
# LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping'"'"''
# [...]
#
# Login to the remote machine and run the module file via from the previous
# step with the explode subcommand to extract the module payload into
# source files::
# $ ssh host1
# $ /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping explode
# Module expanded into:
# /home/badger/.ansible/tmp/ansible-tmp-1461173408.08-279692652635227/ansible
#
# You can now edit the source files to instrument the code or experiment with
# different parameter values. When you're ready to run the code you've modified
# (instead of the code from the actual zipped module), use the execute subcommand like this::
# $ /usr/bin/python /home/badger/.ansible/tmp/ansible-tmp-1461173013.93-9076457629738/ping execute
# Okay to use __file__ here because we're running from a kept file
basedir = os.path.join(os.path.abspath(os.path.dirname(__file__)), 'debug_dir')
args_path = os.path.join(basedir, 'args')
script_path = os.path.join(basedir, '__main__.py')
if command == 'excommunicate':
print('The excommunicate debug command is deprecated and will be removed in 2.11. Use execute instead.')
command = 'execute'
if command == 'explode':
# transform the ZIPDATA into an exploded directory of code and then
# print the path to the code. This is an easy way for people to look
# at the code on the remote machine for debugging it in that
# environment
z = zipfile.ZipFile(zipped_mod)
for filename in z.namelist():
if filename.startswith('/'):
raise Exception('Something wrong with this module zip file: should not contain absolute paths')
dest_filename = os.path.join(basedir, filename)
if dest_filename.endswith(os.path.sep) and not os.path.exists(dest_filename):
os.makedirs(dest_filename)
else:
directory = os.path.dirname(dest_filename)
if not os.path.exists(directory):
os.makedirs(directory)
f = open(dest_filename, 'wb')
f.write(z.read(filename))
f.close()
# write the args file
f = open(args_path, 'wb')
f.write(json_params)
f.close()
print('Module expanded into:')
print('%%s' %% basedir)
exitcode = 0
elif command == 'execute':
# Execute the exploded code instead of executing the module from the
# embedded ZIPDATA. This allows people to easily run their modified
# code on the remote machine to see how changes will affect it.
# Set pythonpath to the debug dir
sys.path.insert(0, basedir)
# read in the args file which the user may have modified
with open(args_path, 'rb') as f:
json_params = f.read()
# Monkeypatch the parameters into basic
from ansible.module_utils import basic
basic._ANSIBLE_ARGS = json_params
# Run the module! By importing it as '__main__', it thinks it is executing as a script
import imp
with open(script_path, 'r') as f:
importer = imp.load_module('__main__', f, script_path, ('.py', 'r', imp.PY_SOURCE))
# Ansible modules must exit themselves
print('{"msg": "New-style module did not handle its own exit", "failed": true}')
sys.exit(1)
else:
print('WARNING: Unknown debug command. Doing nothing.')
exitcode = 0
return exitcode
if __name__ == '__main__':
#
# See comments in the debug() method for information on debugging
#
@ -306,38 +298,14 @@ if __name__ == '__main__':
# store this in remote_tmpdir (use system tempdir instead)
temp_path = tempfile.mkdtemp(prefix='ansible_')
zipped_mod = os.path.join(temp_path, 'ansible_modlib.zip')
modlib = open(zipped_mod, 'wb')
modlib.write(base64.b64decode(ZIPDATA))
modlib.close()
zipped_mod = os.path.join(temp_path, 'ansible_%(ansible_module)s_payload.zip')
with open(zipped_mod, 'wb') as modlib:
modlib.write(base64.b64decode(ZIPDATA))
if len(sys.argv) == 2:
exitcode = debug(sys.argv[1], zipped_mod, ANSIBALLZ_PARAMS)
else:
z = zipfile.ZipFile(zipped_mod, mode='r')
module = os.path.join(temp_path, 'ansible_module_%(ansible_module)s.py')
f = open(module, 'wb')
f.write(z.read('ansible_module_%(ansible_module)s.py'))
f.close()
# When installed via setuptools (including python setup.py install),
# ansible may be installed with an easy-install.pth file. That file
# may load the system-wide install of ansible rather than the one in
# the module. sitecustomize is the only way to override that setting.
z = zipfile.ZipFile(zipped_mod, mode='a')
# py3: zipped_mod will be text, py2: it's bytes. Need bytes at the end
sitecustomize = u'import sys\\nsys.path.insert(0,"%%s")\\n' %% zipped_mod
sitecustomize = sitecustomize.encode('utf-8')
# Use a ZipInfo to work around zipfile limitation on hosts with
# clocks set to a pre-1980 year (for instance, Raspberry Pi)
zinfo = zipfile.ZipInfo()
zinfo.filename = 'sitecustomize.py'
zinfo.date_time = ( %(year)i, %(month)i, %(day)i, %(hour)i, %(minute)i, %(second)i)
z.writestr(zinfo, sitecustomize)
z.close()
exitcode = invoke_module(module, zipped_mod, ANSIBALLZ_PARAMS)
invoke_module(zipped_mod, ANSIBALLZ_PARAMS)
finally:
try:
shutil.rmtree(temp_path)
@ -345,6 +313,33 @@ if __name__ == '__main__':
# tempdir creation probably failed
pass
sys.exit(exitcode)
if __name__ == '__main__':
_ansiballz_main()
'''
ANSIBALLZ_COVERAGE_TEMPLATE = '''
# Access to the working directory is required by coverage.
# Some platforms, such as macOS, may not allow querying the working directory when using become to drop privileges.
try:
os.getcwd()
except OSError:
os.chdir('/')
os.environ['COVERAGE_FILE'] = '%(coverage_output)s'
import atexit
import coverage
cov = coverage.Coverage(config_file='%(coverage_config)s')
def atexit_coverage():
cov.stop()
cov.save()
atexit.register(atexit_coverage)
cov.start()
'''
@ -759,7 +754,7 @@ def _find_module_utils(module_name, b_module_data, module_path, module_args, tas
to_bytes(__author__) + b'"\n')
zf.writestr('ansible/module_utils/__init__.py', b'from pkgutil import extend_path\n__path__=extend_path(__path__,__name__)\n')
zf.writestr('ansible_module_%s.py' % module_name, b_module_data)
zf.writestr('__main__.py', b_module_data)
py_module_cache = {('__init__',): (b'', '[builtin]')}
recursive_finder(module_name, b_module_data, py_module_names, py_module_cache, zf)
@ -805,6 +800,18 @@ def _find_module_utils(module_name, b_module_data, module_path, module_args, tas
interpreter_parts = interpreter.split(u' ')
interpreter = u"'{0}'".format(u"', '".join(interpreter_parts))
coverage_config = os.environ.get('_ANSIBLE_COVERAGE_CONFIG')
if coverage_config:
# Enable code coverage analysis of the module.
# This feature is for internal testing and may change without notice.
coverage = ANSIBALLZ_COVERAGE_TEMPLATE % dict(
coverage_config=coverage_config,
coverage_output=os.environ['_ANSIBLE_COVERAGE_OUTPUT']
)
else:
coverage = ''
now = datetime.datetime.utcnow()
output.write(to_bytes(ACTIVE_ANSIBALLZ_TEMPLATE % dict(
zipdata=zipdata,
@ -819,6 +826,7 @@ def _find_module_utils(module_name, b_module_data, module_path, module_args, tas
hour=now.hour,
minute=now.minute,
second=now.second,
coverage=coverage,
)))
b_module_data = output.getvalue()