Overhaul error handling in the core framework #633

Draft
wants to merge 8 commits into
base: master

fgvanzee
Member

Going forward, this PR branch will implement the changes set forth in #479.

To start out with, only the new configure options are implemented. I'll incrementally update parts of the framework to (a) observe the new error checking flags and respond accordingly, and (b) return err_t values up the call stack wherever possible.

Thanks for your patience on this project, @jdiamondGitHub.

Details:
- This commit makes changes pursuant to #479 that provide additional
  configure-time options for determining whether and how BLIS behaves
  when errors are detected at runtime. The options are not yet honored
  anywhere within the framework. However, this will change in the
  future.
@fgvanzee fgvanzee self-assigned this May 12, 2022
Details:
- Fully updated frame/0 in accordance with new error-handling APIs and
  error-checking policies. This includes:
  a. all functions that might possibly generate an error now return a
     value of type err_t.
  b. every call to a function in (a) now captures the return value and
     inspects it to decide whether it must be returned to the caller.
- Updated about half of the files within frame/base for err_t handling.
- Partially updated frame/thread, as necessary given the updated
  err_t return values for other code included in this commit. A key
  omission is the thread decorators, which do not yet handle err_t
  values.
- Added a new file, bli_error_macro_defs.h, of error-related macros.
  These macros conveniently capture the logic that should be executed
  in several common situations, including checking a locally-determined
  error code for failure (and acting accordingly) as well as checking
  whether the err_t return value from a recently-called function needs
  to be returned up the function stack (in lieu of completing execution
  of the current function normally). Not all of these macros are
  used in the changes introduced in this commit, but they represent
  most of the situations that I foresee needing going forward.
- Re-indexed the err_t enum values so that BLIS_SUCCESS is assigned 0
  and BLIS_FAILURE (that is, generic failure) is assigned -1, instead
  of -1 and -2, respectively. This brings the BLIS error code behavior
  into closer conformity with many other C and Linux functions and
  tools.
- Defined a new errmode_t enum with two values -- BLIS_ERROR_RETURN and
  BLIS_ERROR_ABORT.
- Defined new static inline functions in bli_param_macro_defs.h for
  distinguishing ind_t values (e.g. BLIS_NAT and BLIS_1M). Did the same
  for dir_t values (e.g. BLIS_FWD and BLIS_BWD).
- Replaced all instances in BLIS of

    if ( rntm == NULL )
             { bli_rntm_init_from_global( &rntm_l ); rntm = &rntm_l; }
    else     { rntm_l = *rntm;                       rntm = &rntm_l; }

  with a call to a new static inline function that offers identical
  functionality:

    bli_rntm_init_if_null( &rntm, &rntm_l );

- Replaced all instances in BLIS of

    if ( cntx == NULL ) cntx = bli_gks_query_cntx();

  with a call to a new static inline function that offers identical
  functionality:

    bli_gks_query_cntx_if_null( ( const cntx_t** )&cntx );

- Moved frame/base/cast/bli_castnzm.c and .h to an 'old' sub-directory.
- Removed 'restrict' qualifier from cntx_t* argument to scalv and axpyf
  kernels in 'zen' kernel set.
- Updated hardware auto-detection code to reflect updated function
  signature to bli_arch_string().
- Updated the output of 'configure --help' to correctly indicate that
  error checking is enabled by default.
- Updated testsuite source files to conform to above changes.
- Updated documentation to reflect updated function signatures,
  including removal of 'restrict' qualifier from cntx_t* and auxinfo_t*
  arguments to various kernel APIs.
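
To make the err_t conventions described in this commit message concrete, here is a minimal, self-contained sketch of how a locally detected error might be checked and how a callee's err_t might be passed up the call stack. It does not reproduce the contents of bli_error_macro_defs.h, which are not shown in this thread. The enum names and values (BLIS_SUCCESS = 0, BLIS_FAILURE = -1, BLIS_ERROR_RETURN, BLIS_ERROR_ABORT) come from the text above; every identifier prefixed with sketch_ or SKETCH_ is hypothetical, as is the idea of holding the policy in a file-scope variable.

```c
#include <stdlib.h>

/* Error codes with the re-indexed values described above. */
typedef enum { BLIS_SUCCESS = 0, BLIS_FAILURE = -1 } err_t;

/* Error-handling policy. The two enumerator names come from the commit
   message; storing the policy in this variable is an assumption. */
typedef enum { BLIS_ERROR_RETURN, BLIS_ERROR_ABORT } errmode_t;

static errmode_t sketch_errmode = BLIS_ERROR_RETURN;

/* Hypothetical macro: act on a locally determined error code. On failure,
   either abort or return the code, depending on the policy. */
#define SKETCH_CHECK_LOCAL( r ) \
    do { \
        if ( (r) != BLIS_SUCCESS ) \
        { \
            if ( sketch_errmode == BLIS_ERROR_ABORT ) abort(); \
            return (r); \
        } \
    } while ( 0 )

/* Hypothetical macro: call a function, capture its err_t, and return it
   to the caller instead of finishing the current function normally. */
#define SKETCH_RETURN_IF_FAILURE( call ) \
    do { \
        err_t r_ = (call); \
        if ( r_ != BLIS_SUCCESS ) return r_; \
    } while ( 0 )

/* A leaf function that detects an error locally. */
static err_t sketch_check_dim( int m )
{
    err_t r = ( m < 0 ) ? BLIS_FAILURE : BLIS_SUCCESS;

    SKETCH_CHECK_LOCAL( r );

    return BLIS_SUCCESS;
}

/* A caller that forwards the callee's failure up the stack and only
   writes its result when every check has passed. */
static err_t sketch_operation( int m, int* out )
{
    SKETCH_RETURN_IF_FAILURE( sketch_check_dim( m ) );

    *out = 2 * m;

    return BLIS_SUCCESS;
}
```

With the policy set to BLIS_ERROR_RETURN, calling sketch_operation( -1, &out ) returns BLIS_FAILURE and leaves out untouched, while sketch_operation( 3, &out ) returns BLIS_SUCCESS. Under BLIS_ERROR_ABORT, the same failing call would terminate the process inside sketch_check_dim().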
Details:
- Fixed a couple of compilation errors plus one logical error due to
  variable shadowing.
Details:
- Promoted bli_pba_rntm_set_pba() to a full function, rather than a
  static inline function, so that it can be exported for shared
  libraries. This seems better than the alternative of exporting the
  function bli_pba_query(), which, as far as I can see, end users should
  not need access to.