Express v4.x does not check to see if a Promise returned from a
middleware function will be rejected, so explicitly pass the Promise
rejection reason to `next()`.
We can revert this change after we upgrade to Express v5.0.
See https://expressjs.com/en/guide/error-handling.html for details.
Before, an unhandled rejection or uncaught exception during startup
would cause `exports.exit()` to wait forever for startup completion.
Similarly, an error during shutdown would cause `exports.exit()` to
wait forever for shutdown to complete. Now any error during startup or
shutdown triggers an immediate exit.
Logging verbosity of the openapi handlers was turned down so GitHub
should be happier with INFO now. This makes it easier to troubleshoot
problems.
This reverts commit b98aaf4904.
`waitForPromise()` should always be used with `await` (either directly
or with a later `await` on the returned Promise). In this case,
the condition should be immediately true so `waitForPromise()` is not
the right tool here.
* Delete the unused `optDoNow` parameter from `pendingInit()`.
* Move the `setAuthorInfo()` 1st parameter check out of the wrapper
and in to the `setAuthorInfo()` function itself.
All of the tests in this file are commented out so this file does
nothing. We can uncomment the code and clean it up, but the approach
taken in these tests will never work: For security reasons, browsers
do not allow synthetic key events to perform the default
behavior (such as moving the carent when an arrow key is pressed).
There are two ways to test responses to navigation keys:
* Use WebDriver to create "genuine" keyboard events.
* Suppress the default behavior and implement caret movement
ourselves. This is tremendously complicated, especially arrow
up/down.
Also add symlinks from the old `bin/` and `tests/` locations to avoid
breaking scripts and other tools.
Motivations:
* Scripts and tests no longer have to do dubious things like:
require('ep_etherpad-lite/node_modules/foo')
to access packages installed as dependencies in
`src/package.json`.
* Plugins can access the backend test helper library in a non-hacky
way:
require('ep_etherpad-lite/tests/backend/common')
* We can delete the top-level `package.json` without breaking our
ability to lint the files in `bin/` and `tests/`.
Deleting the top-level `package.json` has downsides: It will cause
`npm` to print warnings whenever plugins are installed, npm will
no longer be able to enforce a plugin's peer dependency on
ep_etherpad-lite, and npm will keep deleting the
`node_modules/ep_etherpad-lite` symlink that points to `../src`.
But there are significant upsides to deleting the top-level
`package.json`: It will drastically speed up plugin installation
because `npm` doesn't have to recursively walk the dependencies in
`src/package.json`. Also, deleting the top-level `package.json`
avoids npm's horrible dependency hoisting behavior (where it moves
stuff from `src/node_modules/` to the top-level `node_modules/`
directory). Dependency hoisting causes numerous mysterious
problems such as silent failures in `npm outdated` and `npm
update`. Dependency hoisting also breaks plugins that do:
require('ep_etherpad-lite/node_modules/foo')
For some reason strings are sometimes passed to `findUnmet()`, which
is obviously unexpected given the way the code is written. Rather than
figure out why strings are passed and how to safely avoid passing
strings, just return early. The net effect is the same, but returning
early avoids setting a property on a string, which is prohibited in
strict mode.
Benefits of `callHookFnSync()` and `callHookFnAsync()`:
* They are a lot more forgiving than `hookCallWrapper()` was.
* They perform useful sanity checks.
* They have extensive unit test coverage.
* They make the behavior of `callFirst()` and `aCallFirst()` match
the behavior of `callAll()` and `aCallAll()`.
Define states and use them to properly handle multiple calls to
`start()`, `stop()`, and `exit()`. (Multiple calls to `exit()` can
happen if there is an uncaught exception or signal during shutdown.)
This should also make it easier to add support for cleanly restarting
the server after a shutdown (for tests or via an `/admin` page).
* lint: skin-variants
* for squash: Fix attachment of event listener
Before this PR the statement was outside the function. I'm assuming
the move into the function body was accidental, so move it back out.
* for squash: Preserve order of function calls
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
There are some problems with nyc:
* The coverage numbers aren't useful in our case because most of the
code is executed outside the test process (the test code is mostly
API client logic).
* nyc messes with line numbers, which makes it much harder to debug
problems.
* We're seeing frequent SIGABRT crashes while nyc is printing the
results table. I'm not sure if nyc is the cause of the crashes, or
if it's making a race condition worse, or if the crashes have
nothing to do with nyc, but we don't lose much by removing it so
we might as well see if the crash frequency improves.
Before this change, the `author` attribute was silently discarded
during `.map()` iteration and the name of the attribute to remove was
included twice with two different values.
Before this commit, the callback passed to `.map()` during attribute
removal was a normal function, not an arrow function. This meant that
the value of `this` in the function body depended on how the callback
was invoked. In this case, the callback was invoked without any
explicit context (it was not called as a method, nor was it called via
`.call()`, `.apply()`, or `.bind()`). Without any explicit context,
the value of `this` depends on strict mode. Currently the function is
in sloppy mode, so `this` refers to the "global this" object (a.k.a.,
`window`). It doesn't make sense for the callback to reference
`window.author`, so I'm assuming the previous behavior was a bug.
Now the function is an arrow function, so the value of `this` comes
from the enclosing lexical context, which in this case is the
AttributeManager object. I believe that was the original intention.
This makes it possible for plugin backend tests to do
`require('ep_etherpad-lite/tests/backend/common')` to access the API
key (among other things).
Eventually we probably should reverse these (move `tests/` to
`src/tests/` and make `tests/` a symlink to `src/tests/`) and move
`bin/` to `src/bin/` so that we can avoid the top-level `package.json`
mess.
The `name` property is only available on cheerio's Element-like
objects; DOM Element objects do not have a `name` property. Switch to
`dom.tagName()` to fix the logic for browsers.
The `parent` property is only available on cheerio's Node-like
objects; DOM Node objects do not have a `parent` property. Switch to
the `parentNode` property so that the code works in browsers as well
as cheerio.
Before, the hook always ignored the return values provided by the hook
functions. Now the hook functions can change the text by either
returning a string or setting `context.text` to the desired value.
Also drop the `styl` and `cls` context properties. They were never
documented and they were always null.
In the DOM, `.children` only includes children that are Element
objects. In cheerio 0.22.0, `.children` includes all child Nodes, not
just Elements. Use `dom.numChildNodes()` and `dom.childNode()` so that
browsers behave the same as cheerio.
`for..in` iterates over inherited properties, which is almost never
desired. In most cases there aren't any inherited enumerable
properties so it's not that big of a deal, but in the case of
HTMLCollection it's very bad because it iterates over every entry
twice (once by numerical index and once by name) plus it includes the
`length` property in the iteration.
The `attribs` property is only available on cheerio's Element-like
objects; DOM Element objects do not have an `attribs` property. Switch
to `dom.nodeAttr()` to fix the logic for browsers.
Various tidy up and linting of contentcollector.js and domline.js.
3 Tests disabled which are not due to be covered.
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
* remove IE and add strict headers
* linting: kids are back, need to stop for today
* linting: farbtastic fix
* lint: more lint fixes
* more lint fixes
* linting: sub 100 errors
* comments where I need help
* ready to be helped :)
* small fixes
* fixes
* linting: all errors resolved
* linting: remove note to self
* fix as per nulli/wezz000li suggestion
* fix as per nulli/wezz000li suggestion
* resolve merge conflicts
* better use if to silence eslint
* Use `for..of` with `Object.keys` instead of `for..in`
* lint: move setSelection to before call
Co-authored-by: webzwo0i <webzwo0i@c3d2.de>
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
This will make the pages gracefully handle HTTP server restart events,
which happen whenever a plugin is installed or uninstalled via the
`/admin/plugins` page.
* lint: collab-client
* Undo incorrect lint fixes
These will be re-fixed in a future commit.
* Properly fix guard-for-in error
* Properly fix prefer-rest-params errors
* Move some code back to where it was
Moving the code makes it hard to review the diff.
* Delete DISCONNECT_REASON case
Someone reading the code won't understand what "used to handle
appLevelDisconnectReason" means until they dig through the Git
history. Given the server never sends messages of type
DISCONNECT_REASON anyway, just delete the case.
* Refine lint fixes
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
Squashed changes from rhansen@rhansen.org:
* Move code back to where it was. (It's easier to review changes
when the code isn't moved. This causes some no-use-before-define
warnings to reappear, but those are just warnings.)
* Move eslint-disable comment to same line
* Use `window.clientvars` to resolve no-global-assign
* Undo changes that aren't about fixing lint errors
* lint: pluginfw tsort.js
* Don't comment out the `console.log()` call
Disabling the log message is out of scope for the pull request.
* Put const and let on separate lines
* Convert `tsort` from function to arrow function
ESLint doesn't complain about this due to a bug in
prefer-arrow/prefer-arrow-functions rule:
https://github.com/TristonJ/eslint-plugin-prefer-arrow/issues/24
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
* fix accidental write to global variable
properly show pending tests
log test name in suite
better log output for received/expected strings
* cc tests: enable second nestedOL test
* ignore the head tag on import
Missing await in call to this._pad.getInternalRevisionAText(rev). Function returns a promise. This bug breaks the createDiffHTML API call (how I discovered it).
These characters are in the RFC3986 reserved set.
These characters are added to the set of characters that cannot be the
last character of a URL to avoid mislinkification.
It should be the client's responsibility to handle null name or color.
In the case of author names, passing null to the client allows users
to fill in the names of other users (via a suggestUserName
CLIENT_MESSAGE).
When a new client opens a socket.io connection and sends a
CLIENT_READY message, Etherpad sends the new client a bunch of
USER_NEWINFO messages, one per other user already connected to the
pad. When iterating over the other users, filter out those without an
author ID or missing from the global authors database.
Normally I would let `eslint --fix` do this for me, but there's a bug
that causes:
const x = function ()
{
// ...
};
to become:
const x = ()
=> {
// ...
};
which ESLint thinks is a syntax error. (It probably is; I don't know
enough about the automatic semicolon insertion rules to be confident.)
* caching_middleware: fix gzip compression not triggered
* packages: If a client sets `Accept-Encoding: gzip`, the responseCache will
include `Content-Encoding: gzip` in all future responses, even
if a subsequent request does not set `Accept-Encoding` or another client
requests the file without setting `Accept-Encoding`.
Fix that.
* caching_middleware: use `test` instead of `match`
* add tests
* make code easier to understand
* make the regex more clear
* Fix bad paren placement in `/javascript` handler
This fixes a bug introduced in commit
ed5a635f4c.
* add regression test for #4495
* Move `/javascript` test to `specialpages.js`
Co-authored-by: webzwo0i <webzwo0i@c3d2.de>
Some authentication plugins use the users defined in the `users`
object but ignore the `password` and `hash` properties.
This change deletes all of the filtering logic, including the logic
that filters out users that have both `password` and `hash` properties
defined. I could have kept that check, but decided to remove it
because:
* There's no harm in defining both `hash` and `password`.
* Allowing both makes it easier to transition from one scheme to
another.
* It's fewer lines of code to maintain.
If `settings.json` contains a user without a `password` property then
nobody should be able to log in as that user using the built-in HTTP
basic authentication. This is true both with and without this change,
but before this change it wasn't immediately obvious that a malicious
user couldn't use an empty or null password to log in as such a user.
This commit adds an explicit nullish check and some unit tests to
ensure that an empty or null password will not work if the `password`
property is null or undefined.
This makes it possible to disable `contentEditable` for certain
elements in some circumstances (e.g., on links so that users can click
on them normally).
if animationState evaluates to -1 or 0, it would end up in a conditional that assign its value to itself. Since this is redundant, it is better to remove this conditional, to avoid an extra check
Rewrite the `callAll` and `aCallAll` functions to support all
reasonable hook behaviors and to report errors for unreasonable
behaviors (e.g., calling the callback twice).
Now a hook function like the following works as expected when invoked
by `aCallAll`:
```
exports.myHookFn = (hookName, context, cb) => {
cb('some value');
return;
};
```
If a hook function neither calls the callback nor returns a
(non-undefined) value then there's no way for the hook system to know
if/when the hook function has finished.
* Use jQuery to build the message HTML so that special characters in
the error message, URL, etc. are properly escaped. This helps
avoid XSS vulnerabilities.
* Use bold text for the error message to make it stand out.
* Add a line break between the error message and "in <url> at line
<line>" so that the error message stands out more.
* Use `<p>...</p>` instead of `</br>` to separate the parts of the
popup.
* Use CSS for spacing instead of `</br>`.
* Grammar fixes (add a missing comma, "at" instead of "in").
Teach Gritter to accept anything that jQuery's `.append()` method
accepts for the title and text of a popup message. This makes it
easier to safely build HTML messages with proper escaping of special
characters (to prevent XSS vulnerabilities).
The debug statement mostly printed the following useless message over
and over, causing Travis CI logs to become truncated:
[DEBUG] pluginfw - [ undefined ] returning
This will be a breaking change for some people.
We removed all internal password control logic. If this affects you, you have two options:
1. Use a plugin for authentication and use session based pad access (recommended).
1. Use a plugin for password setting.
The reasoning for removing this feature is to reduce the overall security footprint of Etherpad. It is unnecessary and cumbersome to keep this feature and with the thousands of available authentication methods available in the world our focus should be on supporting those and allowing more granual access based on their implementations (instead of half assed baking our own).
We could instead await the results of the hook, but then all callers
and their callers recursively would have to be converted to async, and
that's a huge change.
This currently isn't absolutely necessary because all current callers
of `userCanModify` already check for a read-only pad ID themselves.
However:
* This adds defense in depth.
* This makes it possible to simply replace the import handler's
`allowAnyoneToImport` check with a call to `userCanModify`.
There's no need to perform an authentication check in the socket.io
middleware because `PadMessageHandler.handleMessage` calls
`SecurityMananger.checkAccess` and that now performs authentication
and authorization checks.
This change also improves the user experience: Before, access denials
caused socket.io error events in the client, which `pad.js` mostly
ignores (the user doesn't see anything). Now a deny message is sent
back to the client, which causes `pad.js` to display an obvious
permission denied message.
This also fixes a minor bug: `settings.loadTest` is supposed to bypass
authentication and authorization checks, but they weren't bypassed
because `SecurityManager.checkAccess` did not check
`settings.loadTest`.
Rather than reinvent the wheel, use a well-tested library to parse and
write cookies. This should also help prevent XSS vulnerabilities
because the library handles special characters such as semicolon.
* Use the cookie functions from `pad_utils.js`.
* Delete unused methods, variables, and parameters.
* Simplify the logic.
* Use an ES6 class instead of a weird literal thingy.
* Use `const` instead of `var`.
Previously Etherpad would not pass the correct client IP address through and this caused the rate limiter to limit users behind reverse proxies. This change allows Etherpad to use a client IP passed from a reverse proxy.
Note to devs: This header can be spoofed and spoofing the header could be used in an attack. To mitigate additional *steps should be taken by Etherpad site admins IE doing rate limiting at proxy.* This only really applies to large scale deployments but it's worth noting.
Not every string was localized:
* `/admin/plugins` has some CSS magic to draw the tables of plugins
differently on narrow (mobile) screens, and the l10n library we
use does not support that particular magic. The strings that were
not localized are "Name", "Description", "Version", and "Time".
These strings are only stuck in English when the page is viewed on
a narrow screen; normal desktop users will see translated strings.
The CSS magic ought to be replaced with something more robust
(lots of nested `div`s); those remaining strings can be localized
whenever that happens.
* Strings from external sources such as plugin descriptions, error
messages, and `settings.json` comments are not localized.
Before this change, the authorize hook was invoked twice: once before
authentication and again after (if settings.requireAuthorization is
true). Now pre-authentication authorization is instead handled by a
new preAuthorize hook, and the authorize hook is only invoked after
the user has authenticated.
Rationale: Without this change it is too easy to write an
authorization plugin that is too permissive. Specifically:
* If the plugin does not check the path for /admin then a non-admin
user might be able to access /admin pages.
* If the plugin assumes that the user has already been authenticated
by the time the authorize function is called then unauthenticated
users might be able to gain access to restricted resources.
This change also avoids calling the plugin's authorize function twice
per access, which makes it easier for plugin authors to write an
authorization plugin that is easy to understand.
This change may break existing authorization plugins: After this
change, the authorize hook will no longer be able to authorize
non-admin access to /admin pages. This is intentional. Access to admin
pages should instead be controlled via the `is_admin` user setting,
which can be set in the config file or by an authentication plugin.
Also:
* Add tests for the authenticate and authorize hooks.
* Disable the authentication failure delay when testing.
This loses some of the granularity of the default HTTP basic auth
(unknown username vs. bad password), but there is considerable value
in having logging that is consistent no matter what authentication
plugins are installed.
The export request hook wasn't testing if the pad's id was from a read-only
pad before validating with the pad manager.
This includes an extra step that makes the read-only id verification and also
avoids setting the original pad's id as the file's name.
Commit 0bb8d73ba2 fixed the author ID
that is saved in the socket.io sessioninfo when the client sends a
`CLIENT_READY` with `reconnect` set to true, so it is now safe to undo
the workaround from PR #3868.
Fixes#4331.
Benefits:
* More functions are now async which makes it possible for future
changes to use await in those functions.
* This will help keep the server from drowning in too many messages
if we ever add acknowledgements or if WebSocket backpressure ever
becomes reality.
* This might make tests less flaky because changes triggered by a
message will complete before the Promise resolves.
Before, the author ID was only saved in the session info during the
initial CLIENT_READY, not when the client sent a CLIENT_READY due to a
reconnect. This caused the handling of subsequent messages to use an
undefined author ID.
This makes it possible for reverse proxies to transform 403 errors
into something like "upgrade to a premium account to access this
pad".
Also add some webaccess tests.
Move the handleMessageSecurity and handleMessage hooks after the call
to securityManager.checkAccess.
Benefits:
* A handleMessage plugin can safely assume the message will be
handled unless the plugin itself drops the message, so it doesn't
need to repeat the access checks done by the `handleMessage`
function.
* This paves the way for a future enhancement: pass the author ID to
the hooks.
Note: The handleMessageSecurity hook is broken in several ways:
* The hook result is ignored for `CLIENT_READY` and `SWITCH_TO_PAD`
messages because the `handleClientReady` function overwrites the
hook result. This causes the client to receive client vars with
`readonly` set to true, which causes the client to display an
immutable pad even though the pad is technically writable.
* The formatting toolbar buttons are removed for read-only pads
before the handleMessageSecurity hook even runs.
* It is awkwardly named: Without reading the documentation, how is
one supposed to know that "handle message security" actually means
"grant one-time write access to a read-only pad"?
* It is called for every message even though calls after a
`CLIENT_READY` or `SWITCH_TO_PAD` are mostly pointless.
* Why would anyone want to grant write access when the user visits a
read-only pad URL? The user should just visit the writable pad URL
instead.
* Why would anyone want to grant write access that only lasts for a
single socket.io connection?
* There are better ways to temporarily grant write access (e.g., the
authorize hook).
* This hook is inviting bugs because it breaks a core assumption
about `/p/r.*` URLs.
I think the hook should be deprecated and eventually removed.
A session's sessioninfo could go away asynchronously due to a
disconnect. Grab a reference once and use it throughout the function
to avoid dereferencing a null sessioninfo object.
Where feasible I put the await at the end of the function to
minimize the impact on latency.
My motivation for this change: Eliminate a race condition in tests I
am writing.
* `src/node/server.js` can now be run as a script (for normal
operation) or imported as a module (for tests).
* Move shutdown actions to `src/node/server.js` to be close to the
startup actions.
* Put startup and shutdown in functions so that tests can call them.
* Use `await` instead of callbacks.
* Block until the HTTP server is listening to avoid races during
test startup.
* Add a new `shutdown` hook.
* Use the `shutdown` hook to:
* close the HTTP server
* call `end()` on the stats collection to cancel its timers
* call `terminate()` on the Threads.Pool to stop the workers
* Exit with exit code 0 (instead of 1) on SIGTERM.
* Export the HTTP server so that tests can get the HTTP server's
port via `server.address().port` when `settings.port` is 0.
Avoid dereferencing `DB.db` until it is used so that it is possible to
`require('SessionStore')` before calling `DB.init()`. (This is useful
when writing tests.)
Before this change, `promises.timesLimit()` created `concurrency - 1`
too many promises. The only users of this function use a concurrency
of 500, so this meant that 499 extra promises were created each time
it was used. The bug didn't affect correctness, but it did result in a
large number of unnecessary database operations whenever a pad was
deleted. This change fixes that bug.
Also:
* Convert the function to async and have it resolve after all of the
created promises are resolved.
* Reject concurrency of 0 (unless total is 0).
* Document the function.
* Add tests.
Until now, the "mobile layout" (with right toolbar on bottom of the screen) was displayed only when screen was smaller than 800px. It made the toolbar break for screen about 1000px when a lot of plugins are in the toolbar.
Now instead, we detect with javascript when the toolbar icons overflow the natural space available, and we switch in "mobile layout" in such case
New feature to copy a pad without copying entire history. This is useful to perform a low CPU intensive operation while still copying current pad state.
Before, a malicious user could bypass authorization restrictions
imposed by the authorize hook:
* Step 1: Fetch any resource that the malicious user is authorized to
access (e.g., static content).
* Step 2: Use the signed express_sid cookie generated in step 1 to
create a socket.io connection.
* Step 3: Perform the CLIENT_READY handshake for the desired pad.
* Step 4: Profit!
Now the authorization decision made by the authorize hook is
propagated to SecurityManager so that it can approve or reject
socket.io messages as appropriate.
This also sets up future support for per-user read-only and
modify-only (no create) authorization levels.
If mocha hangs after running the tests, hit Ctrl-C and wtfnode will
print open files, open sockets, running timers, and running intervals.
Adding an `after` function that closes/stops all of those things will
ensure that mocha exits when it finishes running the tests.
Authentication plugins almost always want to read and modify
`settings.users`. The settings can already be accessed in a few other
ways, but this is much more convenient.
The authorization logic determines whether the user has already
successfully authenticated by looking to see if `req.session.user`
exists. If an authentication plugin says that it successfully
authenticated the user but it did not create `req.session.user` then
authentication will re-run for every access, and authorization plugins
will be unable to determine whether the user has been authenticated.
Return a 500 internal server error to prevent these problems.
Before it only logged an error like this:
SyntaxError: Unexpected string in JSON at position XYZ
Now it also logs the filename, making it easier to figure out where
the bad data is:
failed to read file /path/to/etherpad-lite/src/locales/en.json: SyntaxError: Unexpected string in JSON at position XYZ