[Ocata][Heat] Strange error returned after stack creation failure - raw template with id xxx not found


On 21/07/20 8:03 pm, Laurent Dumont wrote:
> Hi!
> 
> We are currently troubleshooting a Heat stack issue where one of the 
> stacks (one of 25 or so) is failing to be created properly (seemingly 
> randomly).
> 
> The actual error returned by Heat is quite strange and Google has been 
> quite sparse in terms of references.
> 
> The actual error looks like the following (I've sanitized some of the 
> names):
> 
> Resource CREATE failed: resources.potato: Resource CREATE failed: 
> resources[0]: raw template with id 22273 not found

When creating a nested stack, rather than just calling the RPC method to 
create a new stack, Heat stores the template in the database first and 
passes the ID in the RPC message.[1] (It turns out that by doing it this 
way we can save massive amounts of memory when processing a large tree 
of nested stacks.) My best guess is that this message indicates that the 
template row has been deleted by the time the other engine goes to look 
at it.

I don't see how you could have got an ID like 22273 without the template 
having been successfully stored at some point.

The template is only supposed to be deleted if the RPC call returns with 
an error.[2] The only way I can think of for that to happen before an 
attempt to create the child stack is if the RPC call times out, but the 
original message is eventually picked up by an engine. I would check 
your logs for RPC timeouts and consider increasing them.

What does the status_reason look like at one level above in the tree? 
That should indicate the first error that caused the template to be deleted.

>     heat resource-list STACK_NAME_HERE -n 50
>     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
>     | resource_name       | physical_resource_id | resource_type           | resource_status | updated_time         | stack_name                     |
>     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
>     | potato              | RESOURCE_ID_HERE     | OS::Heat::ResourceGroup | CREATE_FAILED   | 2020-07-18T19:52:10Z | nested_stack_1_STACK_NAME_HERE |
>     | potato_server_group | RESOURCE_ID_HERE     | OS::Nova::ServerGroup   | CREATE_COMPLETE | 2020-07-21T19:52:10Z | nested_stack_1_STACK_NAME_HERE |
>     | 0                   |                      | potato1.yaml            | CREATE_FAILED   | 2020-07-18T19:52:12Z | nested_stack_2_STACK_NAME_HERE |
>     | 1                   |                      | potato1.yaml            | INIT_COMPLETE   | 2020-07-18T19:52:12Z | nested_stack_2_STACK_NAME_HERE |
>     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
> 
> 
> The template itself is pretty simple and attempts to create a 
> ServerGroup and 2 VMs (as part of the ResourceGroup). My feeling is that 
> the creation of one of those machines fails and Heat gets a little kooky 
> and returns an error that might not be the actual root cause. I would 
> have expected the VM to show up in the resource list but I just see the 
> source "yaml".

It's clear from the above output that the scaled unit of the resource 
group is in fact a template (not an OS::Nova::Server), and the error 
occurs while trying to create a stack from that template (potato1.yaml), 
before Heat even has a chance to start creating the server.

> Has anyone seen something similar in the past?

Nope.

cheers,
Zane.

[1] https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L367-L384
[2] https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L335-L342