Commit graph

886 commits

Author SHA1 Message Date
jneem
42c84079c3
Add fuzzing (#15) 2024-10-19 03:22:23 +02:00
Ethiraric
3cade858fa Add test for issue 13, just in case. 2024-10-17 20:02:09 +02:00
Joe Neeman
982407774e Add regression test for infinite loop. 2024-10-17 19:36:20 +02:00
Ethiraric
26707be38d Move CI from subprojects to root. 2024-10-17 19:25:19 +02:00
Ethiraric
7fa54dcb24 Run issue tests through BufferedInput. 2024-10-17 19:18:45 +02:00
Ethiraric
55858d15d9 Add Parser::new_from_iter. 2024-10-17 19:17:26 +02:00
Ethiraric
53dc2b9aba Add a justfile with before_commit at the root. 2024-10-13 16:24:28 +02:00
Ethiraric
979f8eabf9 Fix rustdoc warnings.
This requires making the `input` mod public, as some items in
`char_traits` are referenced in `Input`'s function doccomments.
The items have been exposed in `input` directly, rather than in a
`saphyr::input::char_traits` submod.
2024-10-13 16:21:37 +02:00
Ethiraric
9e3317c179 Add tools to the workspace. 2024-10-13 16:21:03 +02:00
Ethiraric
95fe3fea16 Fix issue with --- in the middle of a scalar.
This issue arises in the `yaml-test-suite` not in a test, but rather in
the test description files themselves. Since we now use the library
itself to load the test files, this made the `yaml-test-suite` fail.
2024-10-13 16:19:58 +02:00
Ethiraric
dc429b7ef7 Deduplicate tools. 2024-10-13 16:18:44 +02:00
jneem
d82866555a Look ahead before testing for EOF. (#12)
This fixes panics in saphyr's test_multiline_trailing_newline and
test_multiline_leading_newline tests, in which `self.input.next_is_z`
would be called on an empty buffer and panic in `peek`.
2024-10-13 14:48:06 +02:00
jneem
434f4521dd Better tracking for beginning and ending positions of mappings. (#10)
Previously, we often used the scanner state to infer the positions of
mappings. This is sometimes wrong, because the scanner has already
scanned ahead by the time the mapping is parsed.

This commit adds a check to the test suite, asserting that parser event
positions are all observed in order, and it fixes the scanner and parser
to make the new check pass.
2024-10-13 14:47:41 +02:00
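The ordering check described in this commit could look roughly like the sketch below. It is an illustration only: `Span` and `first_out_of_order` are hypothetical names, and "in order" is assumed here to mean that start offsets never decrease and that no span ends before it starts.

```rust
/// Hypothetical byte-offset span attached to a parser event (assumption,
/// not the library's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Returns the index of the first span that breaks the ordering, or `None`
/// if every span starts at or after the start of the previous one and no
/// span ends before it starts.
pub fn first_out_of_order(spans: &[Span]) -> Option<usize> {
    let mut last_start = 0;
    for (index, span) in spans.iter().enumerate() {
        if span.end < span.start || span.start < last_start {
            return Some(index);
        }
        last_start = span.start;
    }
    None
}
```

A test suite could call such a helper on the spans of every event stream it produces and fail when it returns `Some(_)`.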
Eduardo Sánchez Muñoz
833343757a Make StrInput type publicly visible
For cases where you might need to, e.g., define a function that takes a `Parser<StrInput<'_>>` as an argument.
2024-10-13 14:43:33 +02:00
Ethiraric
59048f68ae Code cleanup after monorepo-ing. 2024-10-13 14:42:50 +02:00
Ethiraric
8ee4921e5e Some cleanup after monorepo-ing.
- Update root `README.md`
- Remove `bench/` tools we no longer need
- Remove `.gitmodules` for `yaml-test-suite`
2024-10-12 16:29:33 +02:00
Ethiraric
3f9b8c22a3 Add yaml-test-suite as subtree. 2024-10-12 16:15:46 +02:00
Ethiraric
e303cbe543 Squashed 'parser/tests/yaml-test-suite/' content from commit ccfa74e5
git-subtree-dir: parser/tests/yaml-test-suite
git-subtree-split: ccfa74e56afb53da960847ff6e6976c0a0825709
2024-10-12 16:15:38 +02:00
David Aguilar
4e781f56c9 cargo: merge Cargo.toml files into a cargo workspace 2024-10-04 00:35:06 -07:00
David Aguilar
3978720dc9 .gitignore: merge gitignores into a single top-level file 2024-10-04 00:34:35 -07:00
Ethiraric
57d2ff4b19 Convert to monorepo. 2024-10-03 13:55:58 +02:00
Ethiraric
5be327f855 Add changelog entry for last commit. 2024-09-25 16:32:56 +02:00
Ethiraric
fd5a606b19 Make LoadError Clone.
Fixes #11
2024-09-25 16:31:59 +02:00
jneem
d3b9641125 Remove bad assert. (#11)
We reserve bufmaxlen bytes in the string, but then proceed to push
up to bufmaxlen chars. Therefore it is possible that the string will
need to reallocate. A different solution would be to reserve
4*bufmaxlen bytes. Removing the assert is probably ok, because the
attempt to pre-allocate is just a performance optimization.
2024-09-13 23:17:23 +02:00
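The bytes-versus-chars mismatch behind this commit can be reproduced with the standard library alone. The sketch below is not the scanner's code; `fill_with_reserved_bytes` and `worst_case_bytes` are hypothetical helpers that reserve `count` bytes and then push `count` chars, as the message describes.

```rust
/// Reserves `count` bytes, pushes `count` copies of `c`, and returns
/// (capacity before pushing, capacity after pushing, final byte length).
pub fn fill_with_reserved_bytes(c: char, count: usize) -> (usize, usize, usize) {
    // `with_capacity` counts bytes, not chars.
    let mut buf = String::with_capacity(count);
    let before = buf.capacity();
    for _ in 0..count {
        // Each push appends 1 to 4 bytes depending on the char.
        buf.push(c);
    }
    (before, buf.capacity(), buf.len())
}

/// Upper bound on the bytes needed for `count` chars: a UTF-8 encoded
/// `char` takes at most 4 bytes.
pub fn worst_case_bytes(count: usize) -> usize {
    count * 4
}
```

With ASCII input the reservation is enough, but with 4-byte chars the string has to grow, so an assertion that the capacity never changes would fire.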
Eduardo Sánchez Muñoz
e215f546f3 Remove all unsafe code.
* Use `str::strip_prefix` to avoid using `str::from_utf8_unchecked`
* Avoid most uses of `extend_left` unsafe function
* Remove `Input::push_back` and remaining unsafe
2024-08-12 17:06:14 +02:00
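As a small illustration of the first bullet, `str::strip_prefix` returns the remainder of a string as a checked `&str`, so no byte slicing or unchecked UTF-8 conversion is needed. The function name `after_prefix` is hypothetical and only wraps the standard-library call.

```rust
/// Returns the text following `prefix`, or `None` when `input` does not
/// start with `prefix`. The result borrows from `input` and is guaranteed
/// to be valid UTF-8 without any `unsafe` code.
pub fn after_prefix<'a>(input: &'a str, prefix: &str) -> Option<&'a str> {
    input.strip_prefix(prefix)
}
```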
Eduardo Sánchez Muñoz
6c57b5b5e4 Add "explicit" flag to Event::DocumentStart (#5)
Allows the event consumer to know whether the document explicitly starts with a `---`
2024-08-05 17:23:04 +02:00
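A consumer might use such a flag along these lines. This is a sketch under assumptions: `DocEvent` is a hypothetical stand-in enum, not the parser's `Event` type, and the field name `explicit` is chosen here for illustration.

```rust
/// Hypothetical, reduced event type (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum DocEvent {
    /// `explicit` is true when the document began with a `---` line.
    DocumentStart { explicit: bool },
    Scalar(String),
    DocumentEnd,
}

/// Reports whether the first event is an explicitly started document.
pub fn has_explicit_start(events: &[DocEvent]) -> bool {
    matches!(
        events.first(),
        Some(DocEvent::DocumentStart { explicit: true })
    )
}
```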
jneem
926fdfb01b Use spans instead of markers (#3) 2024-08-05 17:08:23 +02:00
Ethiraric
4a5241e0bb Improve buffer handling in scan_plain_scalar. 2024-08-05 01:11:13 +02:00
Ethiraric
7275141203 Add run_bench to Cargo.toml. 2024-07-14 17:01:44 +02:00
Ethiraric
93b7e55bcf Move scanning low-level functions to Input. 2024-07-14 17:01:44 +02:00
Ethiraric
696ca59a16 Move next_can_be_plain_scalar to Input trait. 2024-07-14 17:01:44 +02:00
Ethiraric
8d7c3a1c1b Move skip_ws_to_eol to Input trait. 2024-07-14 17:01:44 +02:00
Ethiraric
db4f26da42 Add StrInput. 2024-07-14 17:01:44 +02:00
Ethiraric
0e9cee18f2 Move buffered_input to an input module. 2024-07-14 17:01:44 +02:00
Ethiraric
65fcb6fde3 Move next_can_be_plain_scalar as free fn.
This is, for some reason, a huge pessimization. `rustc` fails to
optimize it as well as it did when it was part of `Scanner`.

This is however kinda needed if I want to avoid having this code
duplicated in every implementation of the input.
2024-07-14 17:01:44 +02:00
Ethiraric
986c45a8b4 Add custom commands I don't want to forget. 2024-07-14 17:01:44 +02:00
Ethiraric
693cc19042 Avoid too many lookaheads in scan_plain_scalar. 2024-07-14 17:00:10 +02:00
Ethiraric
d27bae9fa5 Fix debug_prints in release mode. 2024-07-14 17:00:10 +02:00
Ethiraric
93a35ab6f7 Move document indicator detection to Input. 2024-07-14 17:00:10 +02:00
Ethiraric
afa1b2319f Remove 1 line wrappers. 2024-07-14 17:00:08 +02:00
Ethiraric
f8b6d849d3 Performance improvement.
The refactoring added `next_is` which takes a `&str` as parameter, while
we only use it with strings of lengths 2 and 3. Replacing this by 2
dedicated methods (which can be added to the trait interface and only
specialized if needed) removes almost all the overhead that was added by
`Input`.
2024-07-14 16:59:10 +02:00
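The shape of this change can be sketched as a trait with a generic `&str` comparison next to fixed-arity methods that an implementation may override. All names below (`Lookahead`, `peek_nth`, `next_is_2`, `next_is_3`, `StrSource`) are hypothetical and are not taken from the library; the sketch makes no claim about measured performance.

```rust
/// Hypothetical lookahead interface (assumption for this sketch).
pub trait Lookahead {
    /// Returns the char `n` positions ahead, if any.
    fn peek_nth(&self, n: usize) -> Option<char>;

    /// Generic comparison against a string of arbitrary length.
    fn next_is(&self, expected: &str) -> bool {
        expected
            .chars()
            .enumerate()
            .all(|(i, c)| self.peek_nth(i) == Some(c))
    }

    /// Dedicated two-char comparison with a default implementation.
    fn next_is_2(&self, c1: char, c2: char) -> bool {
        self.peek_nth(0) == Some(c1) && self.peek_nth(1) == Some(c2)
    }

    /// Dedicated three-char comparison with a default implementation.
    fn next_is_3(&self, c1: char, c2: char, c3: char) -> bool {
        self.peek_nth(0) == Some(c1)
            && self.peek_nth(1) == Some(c2)
            && self.peek_nth(2) == Some(c3)
    }
}

/// Hypothetical input backed by a string slice.
pub struct StrSource<'a>(pub &'a str);

impl Lookahead for StrSource<'_> {
    fn peek_nth(&self, n: usize) -> Option<char> {
        self.0.chars().nth(n)
    }

    /// Specialized override: a single pass over the first three chars
    /// instead of three separate `peek_nth` calls.
    fn next_is_3(&self, c1: char, c2: char, c3: char) -> bool {
        let mut chars = self.0.chars();
        chars.next() == Some(c1) && chars.next() == Some(c2) && chars.next() == Some(c3)
    }
}

/// Example caller: checks for a `---` document indicator.
pub fn at_document_start<T: Lookahead>(input: &T) -> bool {
    input.next_is_3('-', '-', '-')
}
```

Because the fixed-arity methods have defaults, an implementation only overrides them when it has a cheaper way to answer.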
Ethiraric
d9bb7a1693 Add Input interface.
Hiding character fetching behind this interface allows us to create more
specific implementations when appropriate. For instance, an instance
of `Input` can be created for a `&str`, allowing for borrowing and more
efficient peeking and traversing than if we were to fetch characters one
at a time and place them into a temporary buffer.
2024-07-14 16:59:09 +02:00
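To make the idea concrete, here is a minimal sketch of one interface with two backends: one that borrows a `&str` directly and one that buffers chars pulled from an iterator. The names `CharInput`, `SliceInput`, `IterInput` and `count_leading_spaces` are hypothetical; the library's real `Input` trait is not reproduced here.

```rust
use std::collections::VecDeque;

/// Hypothetical character-source interface (assumption for this sketch).
pub trait CharInput {
    /// Returns the next char without consuming it.
    fn peek(&mut self) -> Option<char>;
    /// Consumes the next char, if any.
    fn skip(&mut self);
}

/// Backend that borrows a string slice: no intermediate buffer.
pub struct SliceInput<'a> {
    rest: &'a str,
}

impl<'a> SliceInput<'a> {
    pub fn new(source: &'a str) -> Self {
        Self { rest: source }
    }
}

impl CharInput for SliceInput<'_> {
    fn peek(&mut self) -> Option<char> {
        self.rest.chars().next()
    }

    fn skip(&mut self) {
        let mut chars = self.rest.chars();
        chars.next();
        // Keep only the unread tail of the slice.
        self.rest = chars.as_str();
    }
}

/// Backend that pulls chars one at a time into a temporary buffer.
pub struct IterInput<I: Iterator<Item = char>> {
    source: I,
    buffer: VecDeque<char>,
}

impl<I: Iterator<Item = char>> IterInput<I> {
    pub fn new(source: I) -> Self {
        Self { source, buffer: VecDeque::new() }
    }
}

impl<I: Iterator<Item = char>> CharInput for IterInput<I> {
    fn peek(&mut self) -> Option<char> {
        if self.buffer.is_empty() {
            if let Some(c) = self.source.next() {
                self.buffer.push_back(c);
            }
        }
        self.buffer.front().copied()
    }

    fn skip(&mut self) {
        // Drop the buffered char, or pull one from the source and discard it.
        if self.buffer.pop_front().is_none() {
            self.source.next();
        }
    }
}

/// Scanner-style helper written once against the interface.
pub fn count_leading_spaces<T: CharInput>(input: &mut T) -> usize {
    let mut count = 0;
    while input.peek() == Some(' ') {
        input.skip();
        count += 1;
    }
    count
}
```

The helper runs unchanged on either backend, which is the point of hiding character fetching behind a trait.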
Ethiraric
11cffc6df8 Fix issue with deeply indented block scalars.
Fixes #2
2024-07-14 16:57:26 +02:00
Chris Gunn
1fc46923ef Fix multiline string emit.
Use `|-` instead of `|` when there is not a trailing newline
in the string value.
2024-07-09 16:27:58 +02:00
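The rule in this commit can be sketched as a choice of block scalar header based on the last character of the value. This is not the emitter's code: `block_scalar_header` and `emit_block_scalar` are hypothetical names, the two-space indentation is an assumption, and values ending in several newlines (which would call for `|+`) are not handled.

```rust
/// Picks the literal block header: `|` keeps one trailing newline when the
/// value is read back, `|-` strips it, so `|-` is used when the value has
/// no trailing newline.
pub fn block_scalar_header(value: &str) -> &'static str {
    if value.ends_with('\n') {
        "|"
    } else {
        "|-"
    }
}

/// Emits `key` with `value` as a literal block scalar, indented by two
/// spaces (indentation width is an assumption of this sketch).
pub fn emit_block_scalar(key: &str, value: &str) -> String {
    let mut out = format!("{}: {}\n", key, block_scalar_header(value));
    for line in value.lines() {
        out.push_str("  ");
        out.push_str(line);
        out.push('\n');
    }
    out
}
```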
Ethiraric
d582b0fec9 Refactor to remove unnecessary unwrap. 2024-07-03 00:55:41 +02:00
Ethiraric
23c0b3c547 Move load_from_* functions in Yaml.
This would make more sense in user code:
```rs
Yaml::load_from_str("foo"); // Explicit that we're parsing YAML
load_from_str("foo"); // Too implicit, too generic, may be from another
                      // lib
```

Plus, this mirrors `MarkedYaml`'s behavior.
2024-07-03 00:55:41 +02:00
Ethiraric
842d536cb0 Implement LoadableYamlNode for MarkedYaml.
A few changes have had to be made to `LoadableYamlNode`:
  * The `From<Yaml>` requirement has been removed as it can be
    error-prone. It was not a direct conversion as it is unable to
    handle `Yaml::Hash` or `Yaml::Array` with a non-empty array/map.
  * Instead, `from_bare_yaml` was added, which does essentially the same
    as `From` but does not leak for users of the library.
  * `with_marker` has been added to populate the marker for the `Node`.
    The function is empty for `Yaml`.

`load_from_*` methods have been added to `MarkedYaml` for convenience.
They load YAML using the markers.

The markers returned from `saphyr-parser` are not all correct, meaning
that tests are kind of useless for now as they will fail due to bugs
outside of the scope of this library.
2024-07-03 00:55:41 +02:00
Ethiraric
9ab8dd7c07 Update doccomments. 2024-07-03 00:55:41 +02:00
Ethiraric
d2caaf2ab3 Prepare the ground for annotated parsing.
* Make `YamlLoader` generic on the type of the `Node`. This is required
  because deeper nodes need to have annotations too.
* Add a `LoadableYamlNode` trait, required for YAML node types to be
  loaded by `YamlLoader`. It contains methods required by `YamlLoader`
  during loading.
* Implement `LoadableYamlNode` for `Yaml`.
* Take `load_from_str` out of `YamlLoader` for parsing non-annotated
  nodes. This spares every user from specifying the generics in
  `YamlLoader::<Yaml>::load_from_str`.
2024-07-03 00:55:41 +02:00
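The design in this commit, a loader generic over the node type plus a free function for the common non-annotated case, can be sketched in miniature. Everything below is a toy under stated assumptions: `LoadableNode`, `Loader`, `Plain`, `Marked` and `with_line` are hypothetical names, and the "loading" merely turns each non-empty line into a node instead of parsing YAML.

```rust
use std::marker::PhantomData;

/// Hypothetical trait a node type implements to be buildable by the loader.
pub trait LoadableNode: Sized {
    /// Builds a node from scalar text.
    fn from_scalar(text: &str) -> Self;
    /// Attaches a position; a no-op for node types without annotations.
    fn with_line(self, _line: usize) -> Self {
        self
    }
}

/// Non-annotated node.
#[derive(Debug, PartialEq)]
pub struct Plain(pub String);

impl LoadableNode for Plain {
    fn from_scalar(text: &str) -> Self {
        Plain(text.to_owned())
    }
}

/// Annotated node carrying the 1-based line it came from.
#[derive(Debug, PartialEq)]
pub struct Marked {
    pub line: usize,
    pub text: String,
}

impl LoadableNode for Marked {
    fn from_scalar(text: &str) -> Self {
        Marked { line: 0, text: text.to_owned() }
    }

    fn with_line(mut self, line: usize) -> Self {
        self.line = line;
        self
    }
}

/// Loader generic over the node type it produces.
pub struct Loader<N>(PhantomData<N>);

impl<N: LoadableNode> Loader<N> {
    /// Toy loading: one node per non-empty line (not real YAML parsing).
    pub fn load_from_str(source: &str) -> Vec<N> {
        source
            .lines()
            .enumerate()
            .filter(|(_, line)| !line.trim().is_empty())
            .map(|(index, line)| N::from_scalar(line.trim()).with_line(index + 1))
            .collect()
    }
}

/// Convenience wrapper so callers of the plain case need no turbofish.
pub fn load_from_str(source: &str) -> Vec<Plain> {
    Loader::<Plain>::load_from_str(source)
}
```

Callers who want annotations write `Loader::<Marked>::load_from_str(..)`, while everyone else calls the free function and never sees the generic parameter.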
Ethiraric
425f00ceb8 Add base support for annotated YAML objects. 2024-07-03 00:55:41 +02:00