The lexer and json were using `transmute(-1): char` as a sentinel value for EOF, which is invalid since `char` is strictly a unicode codepoint.
Fixing this allows for range asserts on chars since they always lie between 0 and 0x10FFFF.
The transmute was unsound.
There are many instances of .unwrap_or('\x00') for "ignoring" EOF which
either do not make the situation worse than it was (well, actually make
it better, since it's easy to grep for places that don't handle EOF) or
can never ever be read.
Fixes#8971.
This has been a long time coming. Conditions in rust were initially envisioned
as being a good alternative to error code return pattern. The idea is that all
errors are fatal-by-default, and you can opt-in to handling the error by
registering an error handler.
While sounding nice, conditions ended up having some unforseen shortcomings:
* Actually handling an error has some very awkward syntax:
let mut result = None;
let mut answer = None;
io::io_error::cond.trap(|e| { result = Some(e) }).inside(|| {
answer = Some(some_io_operation());
});
match result {
Some(err) => { /* hit an I/O error */ }
None => {
let answer = answer.unwrap();
/* deal with the result of I/O */
}
}
This pattern can certainly use functions like io::result, but at its core
actually handling conditions is fairly difficult
* The "zero value" of a function is often confusing. One of the main ideas
behind using conditions was to change the signature of I/O functions. Instead
of read_be_u32() returning a result, it returned a u32. Errors were notified
via a condition, and if you caught the condition you understood that the "zero
value" returned is actually a garbage value. These zero values are often
difficult to understand, however.
One case of this is the read_bytes() function. The function takes an integer
length of the amount of bytes to read, and returns an array of that size. The
array may actually be shorter, however, if an error occurred.
Another case is fs::stat(). The theoretical "zero value" is a blank stat
struct, but it's a little awkward to create and return a zero'd out stat
struct on a call to stat().
In general, the return value of functions that can raise error are much more
natural when using a Result as opposed to an always-usable zero-value.
* Conditions impose a necessary runtime requirement on *all* I/O. In theory I/O
is as simple as calling read() and write(), but using conditions imposed the
restriction that a rust local task was required if you wanted to catch errors
with I/O. While certainly an surmountable difficulty, this was always a bit of
a thorn in the side of conditions.
* Functions raising conditions are not always clear that they are raising
conditions. This suffers a similar problem to exceptions where you don't
actually know whether a function raises a condition or not. The documentation
likely explains, but if someone retroactively adds a condition to a function
there's nothing forcing upstream users to acknowledge a new point of task
failure.
* Libaries using I/O are not guaranteed to correctly raise on conditions when an
error occurs. In developing various I/O libraries, it's much easier to just
return `None` from a read rather than raising an error. The silent contract of
"don't raise on EOF" was a little difficult to understand and threw a wrench
into the answer of the question "when do I raise a condition?"
Many of these difficulties can be overcome through documentation, examples, and
general practice. In the end, all of these difficulties added together ended up
being too overwhelming and improving various aspects didn't end up helping that
much.
A result-based I/O error handling strategy also has shortcomings, but the
cognitive burden is much smaller. The tooling necessary to make this strategy as
usable as conditions were is much smaller than the tooling necessary for
conditions.
Perhaps conditions may manifest themselves as a future entity, but for now
we're going to remove them from the standard library.
Closes#9795Closes#8968
This has been a long time coming. Conditions in rust were initially envisioned
as being a good alternative to error code return pattern. The idea is that all
errors are fatal-by-default, and you can opt-in to handling the error by
registering an error handler.
While sounding nice, conditions ended up having some unforseen shortcomings:
* Actually handling an error has some very awkward syntax:
let mut result = None;
let mut answer = None;
io::io_error::cond.trap(|e| { result = Some(e) }).inside(|| {
answer = Some(some_io_operation());
});
match result {
Some(err) => { /* hit an I/O error */ }
None => {
let answer = answer.unwrap();
/* deal with the result of I/O */
}
}
This pattern can certainly use functions like io::result, but at its core
actually handling conditions is fairly difficult
* The "zero value" of a function is often confusing. One of the main ideas
behind using conditions was to change the signature of I/O functions. Instead
of read_be_u32() returning a result, it returned a u32. Errors were notified
via a condition, and if you caught the condition you understood that the "zero
value" returned is actually a garbage value. These zero values are often
difficult to understand, however.
One case of this is the read_bytes() function. The function takes an integer
length of the amount of bytes to read, and returns an array of that size. The
array may actually be shorter, however, if an error occurred.
Another case is fs::stat(). The theoretical "zero value" is a blank stat
struct, but it's a little awkward to create and return a zero'd out stat
struct on a call to stat().
In general, the return value of functions that can raise error are much more
natural when using a Result as opposed to an always-usable zero-value.
* Conditions impose a necessary runtime requirement on *all* I/O. In theory I/O
is as simple as calling read() and write(), but using conditions imposed the
restriction that a rust local task was required if you wanted to catch errors
with I/O. While certainly an surmountable difficulty, this was always a bit of
a thorn in the side of conditions.
* Functions raising conditions are not always clear that they are raising
conditions. This suffers a similar problem to exceptions where you don't
actually know whether a function raises a condition or not. The documentation
likely explains, but if someone retroactively adds a condition to a function
there's nothing forcing upstream users to acknowledge a new point of task
failure.
* Libaries using I/O are not guaranteed to correctly raise on conditions when an
error occurs. In developing various I/O libraries, it's much easier to just
return `None` from a read rather than raising an error. The silent contract of
"don't raise on EOF" was a little difficult to understand and threw a wrench
into the answer of the question "when do I raise a condition?"
Many of these difficulties can be overcome through documentation, examples, and
general practice. In the end, all of these difficulties added together ended up
being too overwhelming and improving various aspects didn't end up helping that
much.
A result-based I/O error handling strategy also has shortcomings, but the
cognitive burden is much smaller. The tooling necessary to make this strategy as
usable as conditions were is much smaller than the tooling necessary for
conditions.
Perhaps conditions may manifest themselves as a future entity, but for now
we're going to remove them from the standard library.
Closes#9795Closes#8968
- `extra::json` didn't make the cut, because of `extra::json` required
dep on `extra::TreeMap`. If/when `extra::TreeMap` moves out of `extra`,
then `extra::json` could move into `serialize`
- `libextra`, `libsyntax` and `librustc` depend on the newly created
`libserialize`
- The extensions to various `extra` types like `DList`, `RingBuf`, `TreeMap`
and `TreeSet` for `Encodable`/`Decodable` were moved into the respective
modules in `extra`
- There is some trickery, evident in `src/libextra/lib.rs` where a stub
of `extra::serialize` is set up (in `src/libextra/serialize.rs`) for
use in the stage0 build, where the snapshot rustc is still making
deriving for `Encodable` and `Decodable` point at extra. Big props to
@huonw for help working out the re-export solution for this
extra: inline extra::serialize stub
fix stuff clobbered in rebase + don't reexport serialize::serialize
no more globs in libserialize
syntax: fix import of libserialize traits
librustc: fix bad imports in encoder/decoder
add serialize dep to librustdoc
fix failing run-pass tests w/ serialize dep
adjust uuid dep
more rebase de-clobbering for libserialize
fixing tests, pushing libextra dep into cfg(test)
fix doc code in extra::json
adjust index.md links to serialize and uuid library
This removes @[] from the parser as well as much of the handling of it (and `@str`) from the compiler as I can find.
I've just rebased @pcwalton's (already reviewed) `@str` removal (and fixed the problems in a separate commit); the only new work is the trailing commits with my authorship.
Closes#11967