Allowing disabling of 'graceful startup'


Brian Stansberry
tl;dr question is how to disable 'graceful startup'. Skip the background if you know what that means. :)

Background


Back in 2016, when we added the feature to allow a server to be started in 'suspended' state[1], that work also included a fix for a longstanding bug: during server start, endpoints would be started and would accept external requests before all the services (e.g. those from deployments) had started. As a result, requests could reach the still-starting server and fail; e.g. HTTP requests might get a 404 or some variety of 500.

I refer to this bug fix as 'graceful startup'.

Since the fix was introduced we've gotten quite a number of requests to be able to turn it off, e.g. WFCORE-4291.[2] The scenario is that users deploy two apps, where app A makes an *external* request to app B during start and won't complete start until that request is handled, and they deploy both A and B in the same server. The server won't allow the external request during boot, so A won't complete start and thus the overall server start hangs until timeout.

I consider this kind of deployment pattern to be a bit of an anti-pattern, but we've gotten enough requests to allow it that I'm looking into how to satisfy it. Also, at least for HTTP requests, mod_cluster can be used to prevent external requests reaching a server before things are ready, so if the 'internal' requests were not sent through the LB there's at least one 'error free' use case for this.
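
To make the hang concrete, here is a toy Java model of the scenario described above. It is only a sketch under stated assumptions: `Server`, `requestAppB`, `startAppA` and `boot` are made-up names, not WildFly APIs, and the real gate lives in the individual endpoints rather than in one class. With the gate on, app A's call to B is rejected until boot completes, but boot cannot complete until A has started, so the start times out.

```java
import java.util.concurrent.TimeUnit;

/** Toy model of the startup hang. None of these names are WildFly APIs. */
class GracefulStartupDemo {

    static final class Server {
        private final boolean gracefulStartup;
        private volatile boolean bootComplete = false;

        Server(boolean gracefulStartup) {
            this.gracefulStartup = gracefulStartup;
        }

        /** External request to app B; rejected during boot when the gate is on. */
        String requestAppB() {
            if (gracefulStartup && !bootComplete) {
                throw new IllegalStateException("rejected: server is still starting");
            }
            return "response from B";
        }

        /** App A retries its call to B until it succeeds or the timeout expires. */
        boolean startAppA(long timeoutMillis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            while (System.nanoTime() < deadline) {
                try {
                    requestAppB();
                    return true;
                } catch (IllegalStateException rejected) {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return false;
                    }
                }
            }
            return false;
        }

        /** Boot only completes once every deployment, including A, has started. */
        boolean boot(long timeoutMillis) {
            bootComplete = startAppA(timeoutMillis);
            return bootComplete;
        }
    }

    public static void main(String[] args) {
        System.out.println("graceful startup on:  started = " + new Server(true).boot(200));
        System.out.println("graceful startup off: started = " + new Server(false).boot(200));
    }
}
```

Run as is, the first server never starts within its timeout and the second starts immediately, which is the behavior users are asking to get back.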


The Question

Question is whether to 

a) have an overall config switch to disable graceful startup across the board (e.g. a new value for the --start-mode cmd line param passed to standalone.sh)

b) have a subsystem specific setting in the undertow subsystem that configures undertow to allow requests in during boot.

Pros of a)

* Other request patterns are also handled. For example, if our app A was making a remote EJB call to app B, then an undertow-only setting wouldn't handle it. If we start adding multiple per-subsystem flags it gets ugly.
* Requests to web applications may still fail, as there are other aspects of the server that reject certain calls until 'graceful startup' is complete. For example ee-concurrency rejects adding scheduled tasks (although that is somewhat a bug[3]), and the XTS integration looks to be designed to reject certain requests.[4] There may be others. If we make web requests an exceptional pattern, going forward we have to account for that pattern in everything.
* The undertow subsystem itself has two different mechanisms for rejecting requests, with three different call patterns, all of which would need to be adapted.

Pros of b)

* It limits the change to the HTTP use case, the one where we know mod_cluster can be used to prevent external requests.
* I'm not sure about the batch subsystem; i.e. whether it is ok to have batch jobs starting before server start is complete. If the relevant services all have MSC dependencies on everything they need it should be ok. If not, there needs to be some adaptation to listen for when the server is fully started, which seems doable.
* There may be code that is using this 'graceful startup' not as a way to prevent end user activity, but as a way to prevent premature internal server activity. I think RecoverySuspendController may be an example of this; i.e. preventing start of the tx recovery thread until the server is started. But for this kind of thing there are other, better solutions.


Right now my preference is a), a global switch. If we're doing this I'm not inclined to limit it to HTTP only as I expect we'll just have to revisit it later. And I think I know how to deal with the more technical pros of the http-only approach.
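
As a rough illustration of option a), the sketch below models the start mode as a single enum that every subsystem would consult. `NORMAL`, `SUSPEND` and `ADMIN_ONLY` mirror the values `--start-mode` accepts today as far as I know; `NON_GRACEFUL` is a purely hypothetical name for the proposed new value, and `rejectsRequestsDuringBoot` is an invented helper, not an existing API.

```java
import java.util.Locale;

/** Sketch of option a). NON_GRACEFUL is a hypothetical value, not an existing one. */
class StartModeSketch {

    enum StartMode {
        NORMAL, SUSPEND, ADMIN_ONLY,   // modelled on the existing --start-mode values
        NON_GRACEFUL;                  // hypothetical: normal boot without the gate

        /** Parses a command line value such as "admin-only". */
        static StartMode parse(String cliValue) {
            return valueOf(cliValue.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
        }

        /**
         * One answer for the whole server, so every subsystem asks the same question.
         * In this sketch all existing modes keep the gate (it is moot for ADMIN_ONLY).
         */
        boolean rejectsRequestsDuringBoot() {
            return this != NON_GRACEFUL;
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"normal", "suspend", "non-graceful"}) {
            StartMode mode = StartMode.parse(value);
            System.out.println("--start-mode=" + value
                    + " rejects requests during boot: " + mode.rejectsRequestsDuringBoot());
        }
    }
}
```

The point of the design is that undertow, EJB, ee-concurrency and so on would all read one flag instead of each growing its own setting.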

WDYT?

Best regards,
Brian


_______________________________________________
wildfly-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/wildfly-dev

Re: Allowing disabling of 'graceful startup'

Ingo Weiss
Hi Brian, thanks for looking into this.

On 2019-07-29 12:17:36-0500, Brian Stansberry wrote:
> The Question
>
> Question is whether to
>
> a) have an overall config switch to disable graceful startup across the
> board (e.g. a new value for the --start-mode cmd line param passed to
> standalone.sh)
 
I think this is the better solution based on your pros. Limiting this
to HTTP(S) requests only is very restrictive and ends up not being
sufficient in some cases, as you described.

Do you think it would be possible to make this configurable per
subsystem as well?

For some subsystems, like Undertow and EJB, you may want to use them as
soon as they become available, to reach out to other systems or even call
a servlet on another deployment that has already started (a case I've
seen before). For others, like Messaging, you may want to wait for
another subsystem, like JCA, to come up first. Does that make sense?

> Best regards,
> Brian

Best regards
--
Ingo Weiss


Re: Allowing disabling of 'graceful startup'

James Perkins
In reply to this post by Brian Stansberry
I think A is likely the simplest option and would "revert" to the behavior users are likely looking for. However, I do think B is an interesting idea, to be able to opt subsystems into graceful startup.

On Mon, Jul 29, 2019 at 11:16 AM Brian Stansberry <[hidden email]> wrote:
> * I'm not sure about the batch subsystem; i.e. whether it is ok to have batch jobs starting before server start is complete. If the relevant services all have MSC dependencies on everthing they need it should be ok. If not there needs to be some adaptation listen for when the server is fully started, which seems doable.

Batch jobs require some other component to start them, for example an EJB, a servlet, etc. The one exception would be on a reload, where the subsystem itself may restart jobs that were running before the reload.

--
James R. Perkins
JBoss by Red Hat


Re: Allowing disabling of 'graceful startup'

Brian Stansberry
In reply to this post by Ingo Weiss


On Tue, Jul 30, 2019 at 12:59 AM Ingo Weiss <[hidden email]> wrote:
> Hi Brian, thanks for looking into this.
>
> On 2019-07-29 12:17:36-0500, Brian Stansberry wrote:
> > The Question
> >
> > Question is whether to
> >
> > a) have an overall config switch to disable graceful startup across the
> > board (e.g. a new value for the --start-mode cmd line param passed to
> > standalone.sh)
>
> I think this the better solution based on your pros. Having this
> limited to only HTTP(S) requests makes it very limiting and ends up
> not being sufficient in some cases, as you described.
>
> Do you think it would be possible to make this configurable per
> subsystem as well?
>
> For some subsystems, like Undertow and EJB, you may want to use as
> soon as they become available to reach out other systems or even call
> a servlet on another deployment that has already started, this a case
> I've seen before, while others, like Messaging, you may want to wait
> for other subsystem, like JCA, to come up first. Does it make sense?

If I understand you correctly, instead of my a) a global flag, or my b) an undertow flag, there would be several b)s: one to tell undertow to let requests through, one to tell EJB to let requests through, one to tell messaging to let requests through (although that one's theoretical, as messaging doesn't have graceful startup/shutdown anyway), and probably one for every subsystem that does anything related to graceful. The user then toggles the ones they want for their app. They'd have to know which they want.

That would be quite a big increase in scope.

Not sure it's worth it, but it's something to think about while I'm on PTO. :)
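
For illustration only, the "several b)s" idea could look roughly like the Java sketch below: one flag per subsystem, falling back to a server-wide default when a subsystem has no flag of its own. The class, its methods and the fallback rule are assumptions made for this example; no such settings exist in the undertow or ejb3 subsystems today.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of per-subsystem toggles. All names and the fallback rule are hypothetical. */
class SubsystemGateSketch {

    private final boolean gracefulByDefault;
    private final Map<String, Boolean> gracefulOverrides;

    SubsystemGateSketch(boolean gracefulByDefault, Map<String, Boolean> gracefulOverrides) {
        this.gracefulByDefault = gracefulByDefault;
        this.gracefulOverrides = new HashMap<>(gracefulOverrides);
    }

    /** A subsystem's own flag wins; otherwise the server-wide default applies. */
    boolean isGraceful(String subsystem) {
        return gracefulOverrides.getOrDefault(subsystem, gracefulByDefault);
    }

    /** Whether this subsystem lets requests through before boot has completed. */
    boolean acceptsDuringBoot(String subsystem) {
        return !isGraceful(subsystem);
    }

    public static void main(String[] args) {
        // The user must know which subsystems their app needs early: here only undertow.
        SubsystemGateSketch gate = new SubsystemGateSketch(true, Map.of("undertow", false));
        for (String subsystem : new String[] {"undertow", "ejb3"}) {
            System.out.println(subsystem + " accepts requests during boot: "
                    + gate.acceptsDuringBoot(subsystem));
        }
    }
}
```

Even in this small form the cost is visible: every subsystem that participates in graceful startup needs its own entry, and the user has to pick the right ones.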



Re: Allowing disabling of 'graceful startup'

Ingo Weiss
On 2019-07-30 17:40:52-0500, Brian Stansberry wrote:

> On Tue, Jul 30, 2019 at 12:59 AM Ingo Weiss <[hidden email]> wrote:
>
> > Hi Brian, thanks for looking into this.
> >
> > On 2019-07-29 12:17:36-0500, Brian Stansberry wrote:
> > > The Question
> > >
> > > Question is whether to
> > >
> > > a) have an overall config switch to disable graceful startup across the
> > > board (e.g. a new value for the --start-mode cmd line param passed to
> > > standalone.sh)
> >
> > I think this the better solution based on your pros. Having this
> > limited to only HTTP(S) requests makes it very limiting and ends up
> > not being sufficient in some cases, as you described.
> >
> > Do you think it would be possible to make this configurable per
> > subsystem as well?
> >
> > For some subsystems, like Undertow and EJB, you may want to use as
> > soon as they become available to reach out other systems or even call
> > a servlet on another deployment that has already started, this a case
> > I've seen before, while others, like Messaging, you may want to wait
> > for other subsystem, like JCA, to come up first. Does it make sense?
> >
>
> If I understand you correctly, instead of my a) a global flag, or my b) an
> undertow flag, there would be several b)s. One to tell undertow to let
> requests through, one to tell EJB to let requests through,, one to tell
> messaging to let requests through (although that one's theoretical as
> messaging doesn't have graceful startup/shutdown anyway.)  Probably one for
> every subsystem that does anything related to graceful. The user then
> toggles the ones they want for their app. They'd have to know which they
> want.
 
That's what I was thinking.

> That would be a quite big increase in scope.

Yeah, it is an increase in scope indeed, but I think it might end up
being a better fit for users.

It could surely be a multiple-phase approach. We start with a), see how
it goes, then move to b) * no_of_possible_subsystems.

> Not sure it's worth it, but it's something to think about while I'm on PTO.
> :)

Enjoy and don't think about this. That's bad for you :)
--
Ingo Weiss


Re: Allowing disabling of 'graceful startup'

Brian Stansberry


On Wed, Jul 31, 2019 at 8:40 AM Ingo Weiss <[hidden email]> wrote:
> On 2019-07-30 17:40:52-0500, Brian Stansberry wrote:
> > On Tue, Jul 30, 2019 at 12:59 AM Ingo Weiss <[hidden email]> wrote:
> >
> > > Hi Brian, thanks for looking into this.
> > >
> > > On 2019-07-29 12:17:36-0500, Brian Stansberry wrote:
> > > > The Question
> > > >
> > > > Question is whether to
> > > >
> > > > a) have an overall config switch to disable graceful startup across the
> > > > board (e.g. a new value for the --start-mode cmd line param passed to
> > > > standalone.sh)
> > >
> > > I think this the better solution based on your pros. Having this
> > > limited to only HTTP(S) requests makes it very limiting and ends up
> > > not being sufficient in some cases, as you described.
> > >
> > > Do you think it would be possible to make this configurable per
> > > subsystem as well?
> > >
> > > For some subsystems, like Undertow and EJB, you may want to use as
> > > soon as they become available to reach out other systems or even call
> > > a servlet on another deployment that has already started, this a case
> > > I've seen before, while others, like Messaging, you may want to wait
> > > for other subsystem, like JCA, to come up first. Does it make sense?
> > >
> >
> > If I understand you correctly, instead of my a) a global flag, or my b) an
> > undertow flag, there would be several b)s. One to tell undertow to let
> > requests through, one to tell EJB to let requests through,, one to tell
> > messaging to let requests through (although that one's theoretical as
> > messaging doesn't have graceful startup/shutdown anyway.)  Probably one for
> > every subsystem that does anything related to graceful. The user then
> > toggles the ones they want for their app. They'd have to know which they
> > want.
>
> That's what I was thinking.
>
> > That would be a quite big increase in scope.
>
> Yeah, it was an increase in scope indeed, but I think it might end up
> being a better fit for users.
>
> It could surely be a multiple-phase approach. We start with a, see how
> it goes, then move to b * no_of_possible_subsystems.

That's true; a global switch doesn't preclude something more fine-grained in the future.

> > Not sure it's worth it, but it's something to think about while I'm on PTO.
> > :)
>
> Enjoy and don't think about this. That's bad for you :)

Thanks. Probably won't think too much. :) But this is not a small or trivial thing, nor is it something huge like dealing with javax.* being renamed to jakarta.*, so it's the kind of thing I sometimes find relaxing to think about when I'm chilling out.

> --
> Ingo Weiss


--
Brian Stansberry
Manager, Senior Principal Software Engineer
Red Hat


Re: Allowing disabling of 'graceful startup'

Scott Marlow
In reply to this post by Brian Stansberry


On 7/29/19 1:17 PM, Brian Stansberry wrote:

> tl;dr question is how to disable 'graceful startup'. Skip the background
> if you know what that means. :)
>
> Background
>
>
> Back in 2016 when we added the feature to allow a server to be started
> in 'suspended' state[1], that work also included a fix for the
> longstanding bug whereby during server start endpoints would be started
> and accepting external requests before all the services (e.g. from
> deployments) would be started. The result would be requests could reach
> the still-starting server and would fail, e.g. HTTP requests might get a
> 404 or some variety of 500.
>
> I refer to this bug fix as 'graceful startup'.
>
> Since the fix was introduced we've gotten quite a number of requests to
> be able to turn off that bug fix, e.g. WFCORE-4291.[2] The scenario is
> users deploy two apps, where app A during start makes an *external*
> request to app B and won't complete start until that request is handled.
> And, the users deploy both A and B in the same server. The server won't
> allow the external request during boot, so A won't complete start and
> thus the overall server start hangs until timeout.
>
> I consider this kind of deployment pattern to be a bit of an
> anti-pattern, but we've gotten enough request to allow it that I'm
> looking into how to satisfy it. Also, at least for HTTP requests,
> mod_cluster can be used to prevent external requests reaching a server
> before things are ready, so if the 'internal' requests were not sent
> through the LB there's at least one 'error free' use case for this.
>
>
> The Question
>
> Question is whether to
>
> a) have an overall config switch to disable graceful startup across the
> board (e.g. a new value for the --start-mode cmd line param passed to
> standalone.sh)
>
> b) have a subsystem specific setting in the undertow subsystem that
> configures undertow to allow requests in during boot.

Would either of the following cover more cases?

c) have an overall config switch that lists the applications that
should allow requests during boot, i.e. the set of applications that
need to be available in order for the gracefully starting applications
to subsequently become available.

d) Similar to (c), except that applications specify the other
applications that must be fully started before them (e.g. A depends on
B, so B is allowed to fully start in the first graceful completion
pass).
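
One way to picture (d) is as a dependency ordering problem. The Java sketch below is an assumption about how such an ordering could be computed, using a standard topological sort (Kahn's algorithm); the `dependsOn` map and the `startupOrder` method are invented for this example and say nothing about how deployments would actually declare their dependencies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of (d): order applications so each starts after the ones it depends on. */
class StartupOrderSketch {

    /** dependsOn maps an application to the applications that must fully start first. */
    static List<String> startupOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> unmet = new TreeMap<>();            // unmet dependency counts
        Map<String, List<String>> dependents = new HashMap<>();  // reverse edges
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            unmet.putIfAbsent(entry.getKey(), 0);
            for (String dependency : entry.getValue()) {
                unmet.putIfAbsent(dependency, 0);
                unmet.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>()).add(entry.getKey());
            }
        }

        // Applications with nothing left to wait for can complete their start.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : unmet.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String app = ready.poll();
            order.add(app);
            for (String dependent : dependents.getOrDefault(app, List.of())) {
                if (unmet.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != unmet.size()) {
            throw new IllegalStateException("circular dependency between applications");
        }
        return order;
    }

    public static void main(String[] args) {
        // A depends on B, so B is started first.
        System.out.println(startupOrder(Map.of("A", List.of("B"))));
    }
}
```

Note that if A and B each call the other during start, no valid order exists, and the sketch reports that as a circular dependency rather than hanging.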

Scott