The `name` property is only available on cheerio's Element-like
objects; DOM Element objects do not have a `name` property. Switch to
`dom.tagName()` to fix the logic for browsers.
The `parent` property is only available on cheerio's Node-like
objects; DOM Node objects do not have a `parent` property. Switch to
the `parentNode` property so that the code works in browsers as well
as cheerio.
Before, the hook always ignored the return values provided by the hook
functions. Now the hook functions can change the text by either
returning a string or setting `context.text` to the desired value.
Also drop the `styl` and `cls` context properties. They were never
documented and they were always null.
In the DOM, `.children` only includes children that are Element
objects. In cheerio 0.22.0, `.children` includes all child Nodes, not
just Elements. Use `dom.numChildNodes()` and `dom.childNode()` so that
browsers behave the same as cheerio.
`for..in` iterates over inherited properties, which is almost never
desired. In most cases there aren't any inherited enumerable
properties so it's not that big of a deal, but in the case of
HTMLCollection it's very bad because it iterates over every entry
twice (once by numerical index and once by name) plus it includes the
`length` property in the iteration.
The `attribs` property is only available on cheerio's Element-like
objects; DOM Element objects do not have an `attribs` property. Switch
to `dom.nodeAttr()` to fix the logic for browsers.
Various tidy up and linting of contentcollector.js and domline.js.
3 Tests disabled which are not due to be covered.
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
* remove IE and add strict headers
* linting: kids are back, need to stop for today
* linting: farbtastic fix
* lint: more lint fixes
* more lint fixes
* linting: sub 100 errors
* comments where I need help
* ready to be helped :)
* small fixes
* fixes
* linting: all errors resolved
* linting: remove note to self
* fix as per nulli/wezz000li suggestion
* fix as per nulli/wezz000li suggestion
* resolve merge conflicts
* better use if to silence eslint
* Use `for..of` with `Object.keys` instead of `for..in`
* lint: move setSelection to before call
Co-authored-by: webzwo0i <webzwo0i@c3d2.de>
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
This will make the pages gracefully handle HTTP server restart events,
which happen whenever a plugin is installed or uninstalled via the
`/admin/plugins` page.
* lint: collab-client
* Undo incorrect lint fixes
These will be re-fixed in a future commit.
* Properly fix guard-for-in error
* Properly fix prefer-rest-params errors
* Move some code back to where it was
Moving the code makes it hard to review the diff.
* Delete DISCONNECT_REASON case
Someone reading the code won't understand what "used to handle
appLevelDisconnectReason" means until they dig through the Git
history. Given the server never sends messages of type
DISCONNECT_REASON anyway, just delete the case.
* Refine lint fixes
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
Squashed changes from rhansen@rhansen.org:
* Move code back to where it was. (It's easier to review changes
when the code isn't moved. This causes some no-use-before-define
warnings to reappear, but those are just warnings.)
* Move eslint-disable comment to same line
* Use `window.clientvars` to resolve no-global-assign
* Undo changes that aren't about fixing lint errors
* lint: pluginfw tsort.js
* Don't comment out the `console.log()` call
Disabling the log message is out of scope for the pull request.
* Put const and let on separate lines
* Convert `tsort` from function to arrow function
ESLint doesn't complain about this due to a bug in
prefer-arrow/prefer-arrow-functions rule:
https://github.com/TristonJ/eslint-plugin-prefer-arrow/issues/24
Co-authored-by: Richard Hansen <rhansen@rhansen.org>
* fix accidental write to global variable
properly show pending tests
log test name in suite
better log output for received/expected strings
* cc tests: enable second nestedOL test
* ignore the head tag on import
Missing await in call to this._pad.getInternalRevisionAText(rev). Function returns a promise. This bug breaks the createDiffHTML API call (how I discovered it).
These characters are in the RFC3986 reserved set.
These characters are added to the set of characters that cannot be the
last character of a URL to avoid mislinkification.
It should be the client's responsibility to handle null name or color.
In the case of author names, passing null to the client allows users
to fill in the names of other users (via a suggestUserName
CLIENT_MESSAGE).
When a new client opens a socket.io connection and sends a
CLIENT_READY message, Etherpad sends the new client a bunch of
USER_NEWINFO messages, one per other user already connected to the
pad. When iterating over the other users, filter out those without an
author ID or missing from the global authors database.
Normally I would let `eslint --fix` do this for me, but there's a bug
that causes:
const x = function ()
{
// ...
};
to become:
const x = ()
=> {
// ...
};
which ESLint thinks is a syntax error. (It probably is; I don't know
enough about the automatic semicolon insertion rules to be confident.)
* caching_middleware: fix gzip compression not triggered
* packages: If a client sets `Accept-Encoding: gzip`, the responseCache will
include `Content-Encoding: gzip` in all future responses, even
if a subsequent request does not set `Accept-Encoding` or another client
requests the file without setting `Accept-Encoding`.
Fix that.
* caching_middleware: use `test` instead of `match`
* add tests
* make code easier to understand
* make the regex more clear
* Fix bad paren placement in `/javascript` handler
This fixes a bug introduced in commit
ed5a635f4c.
* add regression test for #4495
* Move `/javascript` test to `specialpages.js`
Co-authored-by: webzwo0i <webzwo0i@c3d2.de>
Some authentication plugins use the users defined in the `users`
object but ignore the `password` and `hash` properties.
This change deletes all of the filtering logic, including the logic
that filters out users that have both `password` and `hash` properties
defined. I could have kept that check, but decided to remove it
because:
* There's no harm in defining both `hash` and `password`.
* Allowing both makes it easier to transition from one scheme to
another.
* It's fewer lines of code to maintain.
If `settings.json` contains a user without a `password` property then
nobody should be able to log in as that user using the built-in HTTP
basic authentication. This is true both with and without this change,
but before this change it wasn't immediately obvious that a malicious
user couldn't use an empty or null password to log in as such a user.
This commit adds an explicit nullish check and some unit tests to
ensure that an empty or null password will not work if the `password`
property is null or undefined.
This makes it possible to disable `contentEditable` for certain
elements in some circumstances (e.g., on links so that users can click
on them normally).
if animationState evaluates to -1 or 0, it would end up in a conditional that assign its value to itself. Since this is redundant, it is better to remove this conditional, to avoid an extra check
Rewrite the `callAll` and `aCallAll` functions to support all
reasonable hook behaviors and to report errors for unreasonable
behaviors (e.g., calling the callback twice).
Now a hook function like the following works as expected when invoked
by `aCallAll`:
```
exports.myHookFn = (hookName, context, cb) => {
cb('some value');
return;
};
```
If a hook function neither calls the callback nor returns a
(non-undefined) value then there's no way for the hook system to know
if/when the hook function has finished.
* Use jQuery to build the message HTML so that special characters in
the error message, URL, etc. are properly escaped. This helps
avoid XSS vulnerabilities.
* Use bold text for the error message to make it stand out.
* Add a line break between the error message and "in <url> at line
<line>" so that the error message stands out more.
* Use `<p>...</p>` instead of `</br>` to separate the parts of the
popup.
* Use CSS for spacing instead of `</br>`.
* Grammar fixes (add a missing comma, "at" instead of "in").
Teach Gritter to accept anything that jQuery's `.append()` method
accepts for the title and text of a popup message. This makes it
easier to safely build HTML messages with proper escaping of special
characters (to prevent XSS vulnerabilities).
The debug statement mostly printed the following useless message over
and over, causing Travis CI logs to become truncated:
[DEBUG] pluginfw - [ undefined ] returning
This will be a breaking change for some people.
We removed all internal password control logic. If this affects you, you have two options:
1. Use a plugin for authentication and use session based pad access (recommended).
1. Use a plugin for password setting.
The reasoning for removing this feature is to reduce the overall security footprint of Etherpad. It is unnecessary and cumbersome to keep this feature and with the thousands of available authentication methods available in the world our focus should be on supporting those and allowing more granual access based on their implementations (instead of half assed baking our own).
We could instead await the results of the hook, but then all callers
and their callers recursively would have to be converted to async, and
that's a huge change.
This currently isn't absolutely necessary because all current callers
of `userCanModify` already check for a read-only pad ID themselves.
However:
* This adds defense in depth.
* This makes it possible to simply replace the import handler's
`allowAnyoneToImport` check with a call to `userCanModify`.
There's no need to perform an authentication check in the socket.io
middleware because `PadMessageHandler.handleMessage` calls
`SecurityMananger.checkAccess` and that now performs authentication
and authorization checks.
This change also improves the user experience: Before, access denials
caused socket.io error events in the client, which `pad.js` mostly
ignores (the user doesn't see anything). Now a deny message is sent
back to the client, which causes `pad.js` to display an obvious
permission denied message.
This also fixes a minor bug: `settings.loadTest` is supposed to bypass
authentication and authorization checks, but they weren't bypassed
because `SecurityManager.checkAccess` did not check
`settings.loadTest`.
Rather than reinvent the wheel, use a well-tested library to parse and
write cookies. This should also help prevent XSS vulnerabilities
because the library handles special characters such as semicolon.
* Use the cookie functions from `pad_utils.js`.
* Delete unused methods, variables, and parameters.
* Simplify the logic.
* Use an ES6 class instead of a weird literal thingy.
* Use `const` instead of `var`.
Previously Etherpad would not pass the correct client IP address through and this caused the rate limiter to limit users behind reverse proxies. This change allows Etherpad to use a client IP passed from a reverse proxy.
Note to devs: This header can be spoofed and spoofing the header could be used in an attack. To mitigate additional *steps should be taken by Etherpad site admins IE doing rate limiting at proxy.* This only really applies to large scale deployments but it's worth noting.
Not every string was localized:
* `/admin/plugins` has some CSS magic to draw the tables of plugins
differently on narrow (mobile) screens, and the l10n library we
use does not support that particular magic. The strings that were
not localized are "Name", "Description", "Version", and "Time".
These strings are only stuck in English when the page is viewed on
a narrow screen; normal desktop users will see translated strings.
The CSS magic ought to be replaced with something more robust
(lots of nested `div`s); those remaining strings can be localized
whenever that happens.
* Strings from external sources such as plugin descriptions, error
messages, and `settings.json` comments are not localized.
Before this change, the authorize hook was invoked twice: once before
authentication and again after (if settings.requireAuthorization is
true). Now pre-authentication authorization is instead handled by a
new preAuthorize hook, and the authorize hook is only invoked after
the user has authenticated.
Rationale: Without this change it is too easy to write an
authorization plugin that is too permissive. Specifically:
* If the plugin does not check the path for /admin then a non-admin
user might be able to access /admin pages.
* If the plugin assumes that the user has already been authenticated
by the time the authorize function is called then unauthenticated
users might be able to gain access to restricted resources.
This change also avoids calling the plugin's authorize function twice
per access, which makes it easier for plugin authors to write an
authorization plugin that is easy to understand.
This change may break existing authorization plugins: After this
change, the authorize hook will no longer be able to authorize
non-admin access to /admin pages. This is intentional. Access to admin
pages should instead be controlled via the `is_admin` user setting,
which can be set in the config file or by an authentication plugin.
Also:
* Add tests for the authenticate and authorize hooks.
* Disable the authentication failure delay when testing.
This loses some of the granularity of the default HTTP basic auth
(unknown username vs. bad password), but there is considerable value
in having logging that is consistent no matter what authentication
plugins are installed.
The export request hook wasn't testing if the pad's id was from a read-only
pad before validating with the pad manager.
This includes an extra step that makes the read-only id verification and also
avoids setting the original pad's id as the file's name.
Commit 0bb8d73ba2 fixed the author ID
that is saved in the socket.io sessioninfo when the client sends a
`CLIENT_READY` with `reconnect` set to true, so it is now safe to undo
the workaround from PR #3868.
Fixes#4331.
Benefits:
* More functions are now async which makes it possible for future
changes to use await in those functions.
* This will help keep the server from drowning in too many messages
if we ever add acknowledgements or if WebSocket backpressure ever
becomes reality.
* This might make tests less flaky because changes triggered by a
message will complete before the Promise resolves.
Before, the author ID was only saved in the session info during the
initial CLIENT_READY, not when the client sent a CLIENT_READY due to a
reconnect. This caused the handling of subsequent messages to use an
undefined author ID.
This makes it possible for reverse proxies to transform 403 errors
into something like "upgrade to a premium account to access this
pad".
Also add some webaccess tests.
Move the handleMessageSecurity and handleMessage hooks after the call
to securityManager.checkAccess.
Benefits:
* A handleMessage plugin can safely assume the message will be
handled unless the plugin itself drops the message, so it doesn't
need to repeat the access checks done by the `handleMessage`
function.
* This paves the way for a future enhancement: pass the author ID to
the hooks.
Note: The handleMessageSecurity hook is broken in several ways:
* The hook result is ignored for `CLIENT_READY` and `SWITCH_TO_PAD`
messages because the `handleClientReady` function overwrites the
hook result. This causes the client to receive client vars with
`readonly` set to true, which causes the client to display an
immutable pad even though the pad is technically writable.
* The formatting toolbar buttons are removed for read-only pads
before the handleMessageSecurity hook even runs.
* It is awkwardly named: Without reading the documentation, how is
one supposed to know that "handle message security" actually means
"grant one-time write access to a read-only pad"?
* It is called for every message even though calls after a
`CLIENT_READY` or `SWITCH_TO_PAD` are mostly pointless.
* Why would anyone want to grant write access when the user visits a
read-only pad URL? The user should just visit the writable pad URL
instead.
* Why would anyone want to grant write access that only lasts for a
single socket.io connection?
* There are better ways to temporarily grant write access (e.g., the
authorize hook).
* This hook is inviting bugs because it breaks a core assumption
about `/p/r.*` URLs.
I think the hook should be deprecated and eventually removed.
A session's sessioninfo could go away asynchronously due to a
disconnect. Grab a reference once and use it throughout the function
to avoid dereferencing a null sessioninfo object.
Where feasible I put the await at the end of the function to
minimize the impact on latency.
My motivation for this change: Eliminate a race condition in tests I
am writing.
* `src/node/server.js` can now be run as a script (for normal
operation) or imported as a module (for tests).
* Move shutdown actions to `src/node/server.js` to be close to the
startup actions.
* Put startup and shutdown in functions so that tests can call them.
* Use `await` instead of callbacks.
* Block until the HTTP server is listening to avoid races during
test startup.
* Add a new `shutdown` hook.
* Use the `shutdown` hook to:
* close the HTTP server
* call `end()` on the stats collection to cancel its timers
* call `terminate()` on the Threads.Pool to stop the workers
* Exit with exit code 0 (instead of 1) on SIGTERM.
* Export the HTTP server so that tests can get the HTTP server's
port via `server.address().port` when `settings.port` is 0.
Avoid dereferencing `DB.db` until it is used so that it is possible to
`require('SessionStore')` before calling `DB.init()`. (This is useful
when writing tests.)
Before this change, `promises.timesLimit()` created `concurrency - 1`
too many promises. The only users of this function use a concurrency
of 500, so this meant that 499 extra promises were created each time
it was used. The bug didn't affect correctness, but it did result in a
large number of unnecessary database operations whenever a pad was
deleted. This change fixes that bug.
Also:
* Convert the function to async and have it resolve after all of the
created promises are resolved.
* Reject concurrency of 0 (unless total is 0).
* Document the function.
* Add tests.
Until now, the "mobile layout" (with right toolbar on bottom of the screen) was displayed only when screen was smaller than 800px. It made the toolbar break for screen about 1000px when a lot of plugins are in the toolbar.
Now instead, we detect with javascript when the toolbar icons overflow the natural space available, and we switch in "mobile layout" in such case
New feature to copy a pad without copying entire history. This is useful to perform a low CPU intensive operation while still copying current pad state.
Before, a malicious user could bypass authorization restrictions
imposed by the authorize hook:
* Step 1: Fetch any resource that the malicious user is authorized to
access (e.g., static content).
* Step 2: Use the signed express_sid cookie generated in step 1 to
create a socket.io connection.
* Step 3: Perform the CLIENT_READY handshake for the desired pad.
* Step 4: Profit!
Now the authorization decision made by the authorize hook is
propagated to SecurityManager so that it can approve or reject
socket.io messages as appropriate.
This also sets up future support for per-user read-only and
modify-only (no create) authorization levels.
If mocha hangs after running the tests, hit Ctrl-C and wtfnode will
print open files, open sockets, running timers, and running intervals.
Adding an `after` function that closes/stops all of those things will
ensure that mocha exits when it finishes running the tests.
Authentication plugins almost always want to read and modify
`settings.users`. The settings can already be accessed in a few other
ways, but this is much more convenient.
The authorization logic determines whether the user has already
successfully authenticated by looking to see if `req.session.user`
exists. If an authentication plugin says that it successfully
authenticated the user but it did not create `req.session.user` then
authentication will re-run for every access, and authorization plugins
will be unable to determine whether the user has been authenticated.
Return a 500 internal server error to prevent these problems.
Before it only logged an error like this:
SyntaxError: Unexpected string in JSON at position XYZ
Now it also logs the filename, making it easier to figure out where
the bad data is:
failed to read file /path/to/etherpad-lite/src/locales/en.json: SyntaxError: Unexpected string in JSON at position XYZ
package to reduce http requests: nice-select,
pad_automatic_reconnect, skin_variants, scroll, caretPosition
rename unorm in tar.json so it can be included
Before, anyone who could create a socket.io connection to Etherpad
could read, modify, and create pads at will without authenticating
first.
The `checkAccess` middleware in `webaccess.js` normally handles
authentication and authorization, but it does not run for `/socket.io`
requests. This means that the connection handler in `socketio.js` must
handle authentication and authorization. However, before this change:
* The handler did not require a signed `express_sid` cookie.
* After loading the express-session state, the handler did not check
to see if the user had authenticated.
Now the handler requires a signed `express_sid` cookie, and it ensures
that `socket.request.session.user` is non-null if authentication is
required. (`socket.request.session.user` is non-null if and only if
the user has authenticated.)
* Move session validity check and session author ID fetch to a
separate function. This separate function can be used by hooks,
making it easier for them to properly determine the author ID.
* Rewrite the remainder of checkAccess. Benefits:
- The function is more readable and maintainable now.
- Vulnerability fix: Before, the session IDs in sessionCookie
were not validated when checking settings.requireSession. Now,
sessionCookie must identify a valid session for the
settings.requireSession test to pass.
- Bug fix: Before, checkAccess would sometimes use the author ID
associated with the token even if sessionCookie identified a
valid session. Now it always uses the author ID associated
with the session if available.