Solvedjulia API consistency review

I'm starting this as a place to leave notes about things to make sure to consider when checking for API consistency in Julia 1.0.

  • Convention prioritization. Listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (#19150).

  • Positional vs keyword arguments. Long ago we didn't have keyword arguments. They're still sometimes avoided for performance considerations. We should make this choice based on what makes the best API, not on that kind of historical baggage (keyword performance issues should also be addressed so that this is no longer a consideration).

  • Metaprogramming tools. We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx. These should behave consistently: similar signatures, if there are functions with similar signatures, make sure they have similar macro versions. Ideally, they should all return values, rather than some returning values and others printing results, although that might be hard for things like LLVM code and assembly code.

  • IO <=> file name equivalence. We generally allow file names as strings to be passed in place of IO objects and the standard behavior is to open the file in the appropriate mode, pass the resulting IO object to the same function with the same arguments, and then ensure that the IO object is closed afterwards. Verify that all appropriate IO-accepting functions follow this pattern.

  • Reducers APIs. Make sure reducers have consistent behaviors – all take a map function before reduction; congruent dimension arguments, etc.

  • Dimension arguments. Consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired.

  • Mutating/non-mutating pairs. Check that non-mutating functions are paired with mutating functions where it makes sense and vice versa.

  • Tuple vs. vararg. Check that there is general consistency between whether functions take a tuple as the last argument or a vararg.

  • Unions vs. nullables vs. errors. Consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.).

  • Support generators as widely as possible. Make sure any function that could sensibly work with generators does so. We're pretty good about this already, but I'm guessing we've missed a few.

  • Output type selection. Be consistent about whether "output type" API's should be in terms of element type or overall container type (ref #11557 and #16740).

  • Pick a name. There are a few functions/operators with aliases. I think this is fine in cases where one of the names is non-ASCII and the ASCII version is provided so people can still write pure-ASCII code, but there are also cases like <: which is an alias for issubtype where both names are ASCII. We should pick one and deprecated the other. We deprecated is in favor of === and should do similarly here.

  • Consistency with DataStructures. It's somewhat beyond the scope of Base Julia, but we should make sure that all of collections in DataStructures have consistent APIs with those provided by Base. The connection in the other direction is that some of those types may inform how we end up designing the APIs in Base since we want them to extend smoothly and consistently.

  • NaNs vs. DomainErrors. See #5234 – have a policy for when to do which and make sure it is followed consistently.

  • Collection <=> generator. Sometimes you want a collection, sometimes you want a generator. We should go through all our APIs and make sure there's an option for both where it makes sense. Once upon a time, there was a convention to use an uppercase name for the generator version and a lowercase name for the version that's eager and returns a new collection. But no one ever paid any attention to that, so maybe we need a new convention.

  • Higher order functions on associatives. Currently some higher order functions iterate over associative collections with signature (k,v) – e.g. map, filter. Others iterate over pairs, i.e. with signature kv, requiring the body to explicitly destructure the pair into k and v – e.g. all, any. This should be reviewed and made consistent.

  • Convert vs. construct. Allow conversion where appropriate. E.g. there have been multiple issues/questions about convert(String, 'x'). In general, conversion is appropriate when there is a single canonical transformation. Conversion of strings into numbers in general isn't appropriate because there are many textual ways to represent numbers, so we need to parse instead, with options. There's a single canonical way to represent version numbers as strings, however, so we may convert those. We should apply this logic carefully and universally.

  • Review completeness of collections API. We should look at the standard library functions for collections provided by other languages and make sure we have a way of expressing the common operations they have. For example, we don't have a flatten function or a concat function. We probably should.

  • Underscore audit.

46 Answers

✔️Accepted Answer

Underscore audit

The following is an analysis of all symbols exported from Base which contain underscores, are not deprecated, and are not string macros. The main thing to note here is that these are exported names only; this does not include unexported names that we tell people to call qualified.

I've separated things out by category. Hopefully that's more useful than it is annoying.

Reflection

We have the following macros with corresponding functions:

  • @code_llvm, code_llvm
  • @code_lowered, code_lowered
  • @code_native, code_native
  • @code_typed, code_typed
  • @code_warntype, code_warntype

Whatever change is applied to the macros, if any, should be similarly applied to the functions.

  • module_name -> nameof (#25622)
  • module_parent -> parentmodule (#25629, see #25436 for a previous attempt at renaming)
  • method_exists -> hasmethod (#25615)
  • object_id -> objectid (#25615)
  • pointer_from_objref

pointer_from_objref could perhaps do with a more descriptive name, maybe something like address?

Aliases for C interop

The type aliases containing underscores are C_NULL, Cintmax_t, Cptrdiff_t, Csize_t, Cssize_t, Cuintmax_t, and Cwchar_t. Those that end in _t should stay, as they're named to be consistent with their corresponding C types.

C_NULL is the odd one out here, being the only C alias containing an underscore that isn't mirrored in C (since in C this is just NULL). We could consider calling this CNULL.

  • C_NULL

Bit counting

  • count_ones
  • count_zeros
  • trailing_ones
  • trailing_zeros
  • leading_ones
  • leading_zeros

For a discussion of renaming these, see #23531. I very much favor removing the underscores for these, as well as some of proposed replacements in that PR. I think it should be reconsidered.

Unsafe operations

  • unsafe_copyto!
  • unsafe_load
  • unsafe_pointer_to_objref
  • unsafe_read
  • unsafe_store!
  • unsafe_string
  • unsafe_trunc
  • unsafe_wrap
  • unsafe_write

It's probably okay to keep these as-is; the ugliness of the underscore further underscores their unsafety.

Indexing

  • broadcast_getindex
  • broadcast_setindex!
  • to_indices

Apparently broadcast_getindex and broadcast_setindex! exist. I don't understand what they do. Perhaps they could use a more descriptive name?

Interestingly, the single index version of to_indices, Base.to_index, is not exported.

Traces

  • catch_backtrace
  • catch_stacktrace -> stacktrace(catch_backtrace()) (#25615)

Presumably these are the catch block equivalents of backtrace and stacktrace, respectively.

Tasks, processes, and signals

  • current_task
  • task_local_storage
  • disable_sigint
  • reenable_sigint
  • process_exited
  • process_running

Streams

  • redirect_stderr
  • redirect_stdin
  • redirect_stdout
  • nb_available -> bytesavailable (#25634)

It would be nice to have a more general IO -> IO redirection function into which all of these could be combined, e.g. redirect(STDOUT, io), thereby removing both underscores and exports.

Promotion

  • promote_rule
  • promote_shape
  • promote_type

See #23999 for a relevant discussion regarding promote_rule.

Printing

  • print_with_color -> printstyled (see #25522)
  • print_shortest (see #25745)
  • escape_string (see #25620)
  • unescape_string

escape_string and unescape_string are a little odd in that they can print to a stream or return a string. See #25620 for a proposal to move/rename these.

Code loading

  • include_dependency
  • include_string

include_dependency. Is this even used outside of Base? I can't think of a situation where you would want this instead of include in any typical scenario.

include_string. Isn't this just an officially sanctioned version of eval(parse())?

Things I didn't bother categorizing

  • gc_enable -> GC.enable (#25616)
  • get_zero_subnormals
  • set_zero_subnormals
  • time_ns

get_zero_subnormals and set_zero_subnormals could do with more descriptive names. Do they need to be exported?

Other Answers:

No, this is a good place for that. And yes, we should strive to eliminate all names where underscores are necessary :)

I'll add renaming @inferred to @test_inferred.

Apologies if this isn't the appropriate place to mention this, but it would be nice to be more consistent with underscores in function names going forward.

@dpsanders: and exported macros! e.g. @fastmath has no docstring.

More Issues: