Tuesday, May 21, 2019

Debugging a Proxmox LXC container that will not start

Sometimes, after you change a container's configuration file in /etc/pve/lxc, the container may fail to start. You will get a message like this:

Job for pve-container@165.service failed because the control process exited with error code.
See "systemctl status pve-container@165.service" and "journalctl -xe" for details.
command 'systemctl start pve-container@165' failed: exit code 1

The message suggests checking the status of the service by typing:

systemctl status pve-container@165.service

Its output is:

● pve-container@165.service - PVE LXC Container: 165
   Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2019-05-21 08:13:01 CDT; 10s ago
     Docs: man:lxc-start
           man:lxc
           man:pct
  Process: 3560220 ExecStart=/usr/bin/lxc-start -n 165 (code=exited, status=1/FAILURE)

May 21 08:12:59 e2 systemd[1]: Starting PVE LXC Container: 165...
May 21 08:13:01 e2 lxc-start[3560220]: lxc-start: 165: lxccontainer.c: wait_on_daemonized_start: 865 Received container state "ABORTING" instead of "RUNNING"
May 21 08:13:01 e2 lxc-start[3560220]: lxc-start: 165: tools/lxc_start.c: main: 330 The container failed to start
May 21 08:13:01 e2 lxc-start[3560220]: lxc-start: 165: tools/lxc_start.c: main: 333 To get more details, run the container in foreground mode
May 21 08:13:01 e2 lxc-start[3560220]: lxc-start: 165: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options
May 21 08:13:01 e2 systemd[1]: pve-container@165.service: Control process exited, code=exited status=1
May 21 08:13:01 e2 systemd[1]: pve-container@165.service: Killing process 3560226 (3) with signal SIGKILL.
May 21 08:13:01 e2 systemd[1]: Failed to start PVE LXC Container: 165.
May 21 08:13:01 e2 systemd[1]: pve-container@165.service: Unit entered failed state.

May 21 08:13:01 e2 systemd[1]: pve-container@165.service: Failed with result 'exit-code'.


As you can see, the output still does not give enough information to find the cause.

So I recommend starting the container by hand with debug logging written to a temporary file:


lxc-start --logfile /tmp/lxc-start.log --logpriority DEBUG -n [CTID]
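
Most of the resulting log file is routine output. Here is a minimal sketch for pulling out only the warning and error lines. It assumes the usual LXC log layout, where the level name (WARN, ERROR and so on) appears as its own space-separated field. The function name `lxc_log_errors` is made up for this example.

```shell
# Hypothetical helper: print only warning-or-worse lines from an LXC log file.
# Assumes the level name appears as a separate space-delimited field per line.
lxc_log_errors() {
    logfile="${1:-/tmp/lxc-start.log}"
    grep -E ' (WARN|ERROR|CRIT|ALERT|FATAL) ' "$logfile"
}

# Usage, once the log file has been written:
#   lxc_log_errors /tmp/lxc-start.log
```

If nothing is printed, the log holds no lines at those levels, and the full file is worth reading from the bottom up.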



If that log is still not enough, the next command runs lxc-start under strace with the log priority set to trace, which produces an extremely detailed log:

strace -f lxc-start -l trace -o /tmp/trace.log -n [CTID]
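
The strace output is long, and the system calls that fail near the end are usually the interesting part. Here is a rough sketch for narrowing it down. It assumes the strace output itself was saved to a file with strace's own `-o` option placed before `lxc-start`, for example `strace -f -o /tmp/strace.out lxc-start ...`. In the command above, `-o /tmp/trace.log` comes after `lxc-start`, so it sets the LXC log file and the strace output goes to the terminal. The path `/tmp/strace.out` and the function name are made up here.

```shell
# Hypothetical helper: show the last N system calls that returned an error.
# strace prints failed calls as "... = -1 ERRNO (description)".
strace_failures() {
    tracefile="${1:-/tmp/strace.out}"
    count="${2:-20}"
    grep -E ' = -1 E[A-Z0-9]+' "$tracefile" | tail -n "$count"
}

# Usage:
#   strace_failures /tmp/strace.out 30
```

Many failed calls are harmless (for example, probing for optional files), so treat the list as a starting point rather than a diagnosis.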

3 comments:

  1. This might be an old thread, but what I found with this issue was that the network device failed to be created. To fix it, I went into the Proxmox GUI and opened the "Network settings" of the container in question. I changed the gateway to something else and saved. Then I edited the network interface again, changed the gateway back to what it needed to be, and saved. After that the container started with no issues. Hope this helps someone else.

    1. This comment saved me at least several hours looking for solution :)

    2. Glad it helped you out.
