Previously every auto-serialized tags are strongly typed. However
this is not strictly required, and instead it can be exploited
to provide the optimal encoding for smaller integers. This commit
repurposes `EsI8`/`EsU8` through `EsI64`/`EsU64` tags to represent
*any* integers with given ranges: It is now possible to encode
`42u64` as two bytes `EsU8 0x2a`, for example.
There are some limitations:
* It does not apply to non-auto-serialized tags for obvious reasons.
Fortunately, we have already eliminated the biggest source of
such tag in favor of auto-serialized tags: `tag_table_id`.
* Bigger tags cannot be used to represent smaller types.
* Signed tags and unsigned tags do not mix.
We try to move the data when the length can be encoded in
the much smaller number of bytes. This interferes with indices and
type abbreviations however, so this commit introduces a public
interface to get and mark a "stable" (i.e. not affected by
relaxation) position of the current pointer.
The relaxation logic only moves a small data, currently at most
256 bytes, as moving the data can be costly. There might be
further opportunities to allow more relaxation by moving fields
around, which I didn't seriously try.
They replace the existing `EsEnumVid`, `EsVecLen` and `EsMapLen`
tags altogether; the meaning of them can be easily inferred
from the enclosing tag. It also has an added benefit of
encodings for smaller variant ids or lengths being more compact
(5 bytes to 2 bytes).
For the reference, while it is designed to be selectively enabled,
it was essentially enabled throughout every snapshot and nightly
as far as I can tell. This makes the usefulness of `EsLabel` itself
questionable, as it was quite rare that `EsLabel` broke the build.
It had consumed about 20~30% of metadata (!) and so this should be
a huge win.
It doesn't serve any useful purpose. It *might* be useful when
there are some tags that are generated by `Encodable` and
not delimited by any tags, but IIUC it's not the case.
Previous:
<-------------------- len1 ------------------->
EsEnum <len1> EsEnumVid <vid> EsEnumBody <len2> <arg1> <arg2>
<--- len2 -->
Now:
<----------- len1 ---------->
EsEnum <len1> EsEnumVid <vid> <arg1> <arg2>
Many auto-serialization tags are fixed-size (note: many ordinary
tags are also fixed-size but for now this commit ignores them),
so having an explicit length is a waste. This moves any
auto-serialization tags with an implicit length before other tags,
so a test for them is easy. A preliminary experiment shows this
has at least 1% gain over the status quo.
EBML tags are encoded in a variable-length unsigned int (vuint),
which is clever but causes some tags to be encoded in two bytes
while there are really about 180 tags or so. Assuming that there
wouldn't be, say, over 1,000 tags in the future, we can use much
more efficient encoding scheme. The new scheme should support
at most 4,096 tags anyway.
This also flattens a scattered tag namespace (did you know that
0xa9 is followed by 0xb0?) and makes a room for autoserialized tags
in 0x00 through 0x1f.
This changes the type of some public constants/statics in libunicode.
Notably some `&'static &'static [(char, char)]` have changed
to `&'static [(char, char)]`. The regexp crate seems to be the
sole user of these, yet this is technically a [breaking-change]
This commit renames the features for the `std::old_io` and `std::old_path`
modules to `old_io` and `old_path` to help facilitate migration to the new APIs.
This is a breaking change as crates which mention the old feature names now need
to be renamed to use the new feature names.
[breaking-change]
There are a number of holes that the stability lint did not previously cover,
including:
* Types
* Bounds on type parameters on functions and impls
* Where clauses
* Imports
* Patterns (structs and enums)
These holes have all been fixed by overriding the `visit_path` function on the
AST visitor instead of a few specialized cases. This change also necessitated a
few stability changes:
* The `collections::fmt` module is now stable (it was already supposed to be).
* The `thread_local:👿:Key` type is now stable (it was already supposed to
be).
* The `std::rt::{begin_unwind, begin_unwind_fmt}` functions are now stable.
These are required via the `panic!` macro.
* The `std::old_io::stdio::{println, println_args}` functions are now stable.
These are required by the `print!` and `println!` macros.
* The `ops::{FnOnce, FnMut, Fn}` traits are now `#[stable]`. This is required to
make bounds with these traits stable. Note that manual implementations of
these traits are still gated by default, this stability only allows bounds
such as `F: FnOnce()`.
Additionally, the compiler now has special logic to ignore its own generated
`__test` module for the `--test` harness in terms of stability.
Closes#8962Closes#16360Closes#20327
[breaking-change]
In preparation for upcoming changes to the `Writer` trait (soon to be called
`Write`) this commit renames the current `write` method to `write_all` to match
the semantics of the upcoming `write_all` method. The `write` method will be
repurposed to return a `usize` indicating how much data was written which
differs from the current `write` semantics. In order to head off as much
unintended breakage as possible, the method is being deprecated now in favor of
a new name.
[breaking-change]
This commit is an implementation of [RFC 565][rfc] which is a stabilization of
the `std::fmt` module and the implementations of various formatting traits.
Specifically, the following changes were performed:
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/0565-show-string-guidelines.md
* The `Show` trait is now deprecated, it was renamed to `Debug`
* The `String` trait is now deprecated, it was renamed to `Display`
* Many `Debug` and `Display` implementations were audited in accordance with the
RFC and audited implementations now have the `#[stable]` attribute
* Integers and floats no longer print a suffix
* Smart pointers no longer print details that they are a smart pointer
* Paths with `Debug` are now quoted and escape characters
* The `unwrap` methods on `Result` now require `Display` instead of `Debug`
* The `Error` trait no longer has a `detail` method and now requires that
`Display` must be implemented. With the loss of `String`, this has moved into
libcore.
* `impl<E: Error> FromError<E> for Box<Error>` now exists
* `derive(Show)` has been renamed to `derive(Debug)`. This is not currently
warned about due to warnings being emitted on stage1+
While backwards compatibility is attempted to be maintained with a blanket
implementation of `Display` for the old `String` trait (and the same for
`Show`/`Debug`) this is still a breaking change due to primitives no longer
implementing `String` as well as modifications such as `unwrap` and the `Error`
trait. Most code is fairly straightforward to update with a rename or tweaks of
method calls.
[breaking-change]
Closes#21436
This gets rid of the 'experimental' level, removes the non-staged_api
case (i.e. stability levels for out-of-tree crates), and lets the
staged_api attributes use 'unstable' and 'deprecated' lints.
This makes the transition period to the full feature staging design
a bit nicer.
This partially implements the feature staging described in the
[release channel RFC][rc]. It does not yet fully conform to the RFC as
written, but does accomplish its goals sufficiently for the 1.0 alpha
release.
It has three primary user-visible effects:
* On the nightly channel, use of unstable APIs generates a warning.
* On the beta channel, use of unstable APIs generates a warning.
* On the beta channel, use of feature gates generates a warning.
Code that does not trigger these warnings is considered 'stable',
modulo pre-1.0 bugs.
Disabling the warnings for unstable APIs continues to be done in the
existing (i.e. old) style, via `#[allow(...)]`, not that specified in
the RFC. I deem this marginally acceptable since any code that must do
this is not using the stable dialect of Rust.
Use of feature gates is itself gated with the new 'unstable_features'
lint, on nightly set to 'allow', and on beta 'warn'.
The attribute scheme used here corresponds to an older version of the
RFC, with the `#[staged_api]` crate attribute toggling the staging
behavior of the stability attributes, but the user impact is only
in-tree so I'm not concerned about having to make design changes later
(and I may ultimately prefer the scheme here after all, with the
`#[staged_api]` crate attribute).
Since the Rust codebase itself makes use of unstable features the
compiler and build system to a midly elaborate dance to allow it to
bootstrap while disobeying these lints (which would otherwise be
errors because Rust builds with `-D warnings`).
This patch includes one significant hack that causes a
regression. Because the `format_args!` macro emits calls to unstable
APIs it would trigger the lint. I added a hack to the lint to make it
not trigger, but this in turn causes arguments to `println!` not to be
checked for feature gates. I don't presently understand macro
expansion well enough to fix. This is bug #20661.
Closes#16678
[rc]: https://github.com/rust-lang/rfcs/blob/master/text/0507-release-channels.md
fmt::Show is for debugging, and can and should be implemented for
all public types. This trait is used with `{:?}` syntax. There still
exists #[derive(Show)].
fmt::String is for types that faithfully be represented as a String.
Because of this, there is no way to derive fmt::String, all
implementations must be purposeful. It is used by the default format
syntax, `{}`.
This will break most instances of `{}`, since that now requires the type
to impl fmt::String. In most cases, replacing `{}` with `{:?}` is the
correct fix. Types that were being printed specifically for users should
receive a fmt::String implementation to fix this.
Part of #20013
[breaking-change]
This commit moves the libserialize crate (and will force the hand of the
rustc-serialize crate) to not require the `old_orphan_check` feature gate as
well as using associated types wherever possible. Concretely, the following
changes were made:
* The error type of `Encoder` and `Decoder` is now an associated type, meaning
that these traits have no type parameters.
* The `Encoder` and `Decoder` type parameters on the `Encodable` and `Decodable`
traits have moved to the corresponding method of the trait. This movement
alleviates the dependency on `old_orphan_check` but implies that
implementations can no longer be specialized for the type of encoder/decoder
being implemented.
Due to the trait definitions changing, this is a:
[breaking-change]
This removes a large array of deprecated functionality, regardless of how
recently it was deprecated. The purpose of this commit is to clean out the
standard libraries and compiler for the upcoming alpha release.
Some notable compiler changes were to enable warnings for all now-deprecated
command line arguments (previously the deprecated versions were silently
accepted) as well as removing deriving(Zero) entirely (the trait was removed).
The distribution no longer contains the libtime or libregex_macros crates. Both
of these have been deprecated for some time and are available externally.
followed by a semicolon.
This allows code like `vec![1i, 2, 3].len();` to work.
This breaks code that uses macros as statements without putting
semicolons after them, such as:
fn main() {
...
assert!(a == b)
assert!(c == d)
println(...);
}
It also breaks code that uses macros as items without semicolons:
local_data_key!(foo)
fn main() {
println("hello world")
}
Add semicolons to fix this code. Those two examples can be fixed as
follows:
fn main() {
...
assert!(a == b);
assert!(c == d);
println(...);
}
local_data_key!(foo);
fn main() {
println("hello world")
}
RFC #378.
Closes#18635.
[breaking-change]
Relax some of the bounds on the decoder methods back to FnMut to help accomodate
some more flavorful variants of decoders which may need to run the closure more
than once when it, for example, attempts to find the first successful enum to
decode.
This a breaking change due to the bounds for the trait switching, and clients
will need to update from `FnOnce` to `FnMut` as well as likely making the local
function binding mutable in order to call the function.
[breaking-change]
This change makes the compiler no longer infer whether types (structures
and enumerations) implement the `Copy` trait (and thus are implicitly
copyable). Rather, you must implement `Copy` yourself via `impl Copy for
MyType {}`.
A new warning has been added, `missing_copy_implementations`, to warn
you if a non-generic public type has been added that could have
implemented `Copy` but didn't.
For convenience, you may *temporarily* opt out of this behavior by using
`#![feature(opt_out_copy)]`. Note though that this feature gate will never be
accepted and will be removed by the time that 1.0 is released, so you should
transition your code away from using it.
This breaks code like:
#[deriving(Show)]
struct Point2D {
x: int,
y: int,
}
fn main() {
let mypoint = Point2D {
x: 1,
y: 1,
};
let otherpoint = mypoint;
println!("{}{}", mypoint, otherpoint);
}
Change this code to:
#[deriving(Show)]
struct Point2D {
x: int,
y: int,
}
impl Copy for Point2D {}
fn main() {
let mypoint = Point2D {
x: 1,
y: 1,
};
let otherpoint = mypoint;
println!("{}{}", mypoint, otherpoint);
}
This is the backwards-incompatible part of #13231.
Part of RFC #3.
[breaking-change]
In regards to:
https://github.com/rust-lang/rust/issues/19253#issuecomment-64836729
This commit:
* Changes the #deriving code so that it generates code that utilizes fewer
reexports (in particur Option::* and Result::*), which is necessary to
remove those reexports in the future
* Changes other areas of the codebase so that fewer reexports are utilized
This breaks code that referred to variant names in the same namespace as
their enum. Reexport the variants in the old location or alter code to
refer to the new locations:
```
pub enum Foo {
A,
B
}
fn main() {
let a = A;
}
```
=>
```
pub use self::Foo::{A, B};
pub enum Foo {
A,
B
}
fn main() {
let a = A;
}
```
or
```
pub enum Foo {
A,
B
}
fn main() {
let a = Foo::A;
}
```
[breaking-change]
Currently `Decoder` implementations are not provided the tuple arity as
a parameter to `read_tuple`. This forces all encoder/decoder combos to
serialize the arity along with the elements. Tuple-arity is always known
statically at the decode site, because it is part of the type of the
tuple, so it could instead be provided as an argument to `read_tuple`,
as it is to `read_struct`.
The upside to this is that serialized tuples could become smaller in
encoder/decoder implementations which choose not to serialize type
(arity) information. For example, @TyOverby's
[binary-encode](https://github.com/TyOverby/binary-encode) format is
currently forced to serialize the tuple-arity along with every tuple,
despite the information being statically known at the decode site.
A downside to this change is that the tuple-arity of serialized tuples
can no longer be automatically checked during deserialization. However,
for formats which do serialize the tuple-arity, either explicitly (rbml)
or implicitly (json), this check can be added to the `read_tuple` method.
The signature of `Deserialize::read_tuple` and
`Deserialize::read_tuple_struct` are changed, and thus binary
backwards-compatibility is broken. This change does *not* force
serialization formats to change, and thus does not break decoding values
serialized prior to this change.
[breaking-change]
https://github.com/rust-lang/rfcs/pull/221
The current terminology of "task failure" often causes problems when
writing or speaking about code. You often want to talk about the
possibility of an operation that returns a Result "failing", but cannot
because of the ambiguity with task failure. Instead, you have to speak
of "the failing case" or "when the operation does not succeed" or other
circumlocutions.
Likewise, we use a "Failure" header in rustdoc to describe when
operations may fail the task, but it would often be helpful to separate
out a section describing the "Err-producing" case.
We have been steadily moving away from task failure and toward Result as
an error-handling mechanism, so we should optimize our terminology
accordingly: Result-producing functions should be easy to describe.
To update your code, rename any call to `fail!` to `panic!` instead.
Assuming you have not created your own macro named `panic!`, this
will work on UNIX based systems:
grep -lZR 'fail!' . | xargs -0 -l sed -i -e 's/fail!/panic!/g'
You can of course also do this by hand.
[breaking-change]
This unifies the `non_snake_case_functions` and `uppercase_variables` lints
into one lint, `non_snake_case`. It also now checks for non-snake-case modules.
This also extends the non-camel-case types lint to check type parameters, and
merges the `non_uppercase_pattern_statics` lint into the
`non_uppercase_statics` lint.
Because the `uppercase_variables` lint is now part of the `non_snake_case`
lint, all non-snake-case variables that start with lowercase characters (such
as `fooBar`) will now trigger the `non_snake_case` lint.
New code should be updated to use the new `non_snake_case` lint instead of the
previous `non_snake_case_functions` and `uppercase_variables` lints. All use of
the `non_uppercase_pattern_statics` should be replaced with the
`non_uppercase_statics` lint. Any code that previously contained non-snake-case
module or variable names should be updated to use snake case names or disable
the `non_snake_case` lint. Any code with non-camel-case type parameters should
be changed to use camel case or disable the `non_camel_case_types` lint.
[breaking-change]
A quick and dirty fix for #15036 until we get serious decoder reform.
Right now it is impossible for a Decodable to signal a decode error,
for example if it has only finitely many allowed values, is a string
which must be encoded a certain way, needs a valid checksum, etc. For
example in the libuuid implementation of Decodable an Option is
unwrapped, meaning that a decode of a malformed UUID will cause the
task to fail.
Since this adds a method to the `Decoder` trait, all users will need
to update their implementations to add it. The strategy used for the
current implementations for JSON and EBML is to add a new entry to
the error enum `ApplicationError(String)` which stores the string
provided to `.error()`.
[breaking-change]
Our implementation of ebml has diverged from the standard in order
to better serve the needs of the compiler, so it doesn't make much
sense to call what we have ebml anyore. Furthermore, our implementation
is pretty crufty, and should eventually be rewritten into a format
that better suits the needs of the compiler. This patch factors out
serialize::ebml into librbml, otherwise known as the Really Bad
Markup Language. This is a stopgap library that shouldn't be used
by end users, and will eventually be replaced by something better.
[breaking-change]